Hostname Resolution in Hadoop Environments
Background
On a network, every machine has two identities. For management purposes, people give it a hostname; machines, however, talk to each other by IP address. So, in a corner nobody notices, thousands upon thousands of hostname -> IP address translations happen every moment. Mention name resolution and you cannot avoid the "professional": DNS. Yet when this battle-tested piece of internet infrastructure is paired with big data, the two bicker like newlyweds. Either resolution is slow and drags down big-data response times, or DNS complains it is overloaded with chores. Sometimes the DNS configuration has already been commented out, yet the cluster still ends up entangled with DNS. Are the two really that incompatible?
Problem Examples
Problem 1: Slow DNS resolution hurts big-data performance
This problem mainly shows up during connection setup, with stalls of 10-15 s. For example, the following HDFS client log output and jstack trace:
# Log output:
2021-04-22 15:15:01,038 | DEBUG | pool-13-thread-165 | Connecting to 100.76.18.13/100.76.18.13:25000 | org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814)
....
2021-04-22 15:15:14,672 | DEBUG | pool-13-thread-165 | PrivilegedAction as:hive/hadoop.hadoop.com@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:829) | org.apache.hadoop.security.UserGroupInformation.logPrivilegedAction(UserGroupInformation.java:1756)

# jstack:
"pool-13-thread-165" #2003 prio=5 os_prio=0 tid=0x00002b56846d0800 nid=0x4bb9 runnable [0x00002b56c1d4b000]
   java.lang.Thread.State: RUNNABLE
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
        at java.net.InetAddress.getAllByName(InetAddress.java:1193)
        at java.net.InetAddress.getAllByName(InetAddress.java:1127)
        at java.net.InetAddress.getByName(InetAddress.java:1077)
        at org.apache.hadoop.security.SecurityUtil$QualifiedHostResolver.getInetAddressByName(SecurityUtil.java:684)
        at org.apache.hadoop.security.SecurityUtil$QualifiedHostResolver.getByExactName(SecurityUtil.java:657)
        at org.apache.hadoop.security.SecurityUtil$QualifiedHostResolver.getByName(SecurityUtil.java:625)
        at org.apache.hadoop.security.SecurityUtil.getByName(SecurityUtil.java:549)
        at org.apache.hadoop.net.NetUtils.getLocalInetAddress(NetUtils.java:684)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:701)
        - locked <0x000000060130ef08> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:823)
        - locked <0x000000060130ef08> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:436)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1613)
Problem 2: The big-data cluster "DDoSes" the DNS server
The DNS server was under heavy load. Investigation showed that 1000+ nodes were querying a record that does not exist. Had it come under a DDoS attack?
31-MAR-2021 17:56:35.464 queries: client x.x.x.x#55638: query: manager IN A +
The query log showed that all of these resolution requests came from big-data cluster nodes.
Problem 3: How to turn DNS off correctly
DNS had been configured in the cluster and was affecting service requests. The nameserver in /etc/resolv.conf was commented out:
#nameserver 170.4.0.101
and /etc/nsswitch.conf was changed:
hosts: files dns myhostname --> hosts: files myhostname
No local bind service was running either, so why was resolution still going through DNS?
Analysis
A big-data platform needs DNS resolution mainly in two scenarios: (1) translating cluster node names into IP addresses, and (2) translating the host in a Kerberos principal during security authentication once Kerberos is enabled. In practice, scenario 1 rarely goes through DNS, because node hostnames are fixed and listed in /etc/hosts. Scenario 2 causes most of the trouble: the hostnames involved are not fixed, and entries may be missing from or wrongly added to /etc/hosts.
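As a minimal illustration (the node name below is hypothetical; the principal host is taken from the logs above), both scenarios ultimately reduce to an InetAddress.getByName call, which consults /etc/hosts and, failing that, DNS:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolutionScenarios {
    static void resolve(String name) {
        try {
            System.out.println(name + " -> " + InetAddress.getByName(name).getHostAddress());
        } catch (UnknownHostException e) {
            System.out.println(name + " -> unknown host");
        }
    }

    public static void main(String[] args) {
        // Scenario 1: a cluster node name, normally answered from /etc/hosts.
        resolve("node-1.cluster.example");
        // Scenario 2: the host field of a Kerberos principal such as
        // hive/hadoop.hadoop.com@HADOOP.COM, resolved during authentication;
        // if it is missing from /etc/hosts the lookup falls through to DNS.
        resolve("hadoop.hadoop.com");
    }
}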
In scenario 2, in a Kerberos environment, when the client establishes a connection to the server over SASL it takes the hostname from its own Kerberos principal and then looks up the corresponding local address:
// org.apache.hadoop.ipc.Client$Connection#setupConnection
/*
 * Bind the socket to the host specified in the principal name of the
 * client, to ensure Server matching address of the client connection
 * to host name in principal passed.
 */
InetSocketAddress bindAddr = null;
if (ticket != null && ticket.hasKerberosCredentials()) {
  KerberosInfo krbInfo =
      remoteId.getProtocol().getAnnotation(KerberosInfo.class);
  if (krbInfo != null) {
    String principal = ticket.getUserName();
    String host = SecurityUtil.getHostFromPrincipal(principal);
    // If host name is a valid local address then bind socket to it
    InetAddress localAddr = NetUtils.getLocalInetAddress(host);
    if (localAddr != null) {
      this.socket.setReuseAddress(true);
      localAddr = NetUtils.bindToLocalAddress(localAddr, bindToWildCardAddress);
      LOG.debug("Binding {} to {}", principal,
          (bindToWildCardAddress) ? "0.0.0.0" : localAddr);
      this.socket.bind(new InetSocketAddress(localAddr, 0));
    }
  }
}
Why look up an address based on the principal? The change was introduced in the upstream issue HADOOP-7215. HADOOP-7104 had added a server-side check that compares the client's hostname with the hostname in the client's Kerberos principal; to keep that comparison from failing, the client should not bind to a random IP address but to the IP address corresponding to the host field of its Kerberos principal.
HADOOP-7104 introduced a change where RPC server matches client's hostname with the hostname specified in the client's Kerberos principal name. RPC client binds the socket to a random local address, which might not match the hostname specified in the principal name. This results authorization failure of the client at the server.
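A conceptual sketch of that check, not the actual Hadoop server code (it assumes hadoop-common on the classpath for SecurityUtil, and InetAddress.getLocalHost() stands in for the remote client's address):

import java.net.InetAddress;
import org.apache.hadoop.security.SecurityUtil;

public class PrincipalHostCheck {
    public static void main(String[] args) throws Exception {
        // Stand-in for the address of the incoming client connection.
        InetAddress clientAddr = InetAddress.getLocalHost();
        // Reverse-resolving the address may itself hit DNS.
        String clientHost = clientAddr.getCanonicalHostName();
        // Host field of the client's Kerberos principal.
        String principalHost =
            SecurityUtil.getHostFromPrincipal("hive/hadoop.hadoop.com@HADOOP.COM");

        // If the client bound its socket to an arbitrary local address,
        // clientHost will not match principalHost and authorization fails;
        // hence the bind-to-principal-host logic shown above.
        System.out.println(clientHost.equalsIgnoreCase(principalHost)
                ? "host matches principal" : "host/principal mismatch");
    }
}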
NetUtils#getLocalInetAddress resolves the address for the hostname and checks whether it is a local address. If it is not, the method returns null and the socket falls back to a random local address.
/**
 * Checks if {@code host} is a local host name and return {@link InetAddress}
 * corresponding to that address.
 *
 * @param host the specified host
 * @return a valid local {@link InetAddress} or null
 * @throws SocketException if an I/O error occurs
 */
public static InetAddress getLocalInetAddress(String host)
    throws SocketException {
  if (host == null) {
    return null;
  }
  InetAddress addr = null;
  try {
    addr = SecurityUtil.getByName(host);
    if (NetworkInterface.getByInetAddress(addr) == null) {
      addr = null; // Not a local address
    }
  } catch (UnknownHostException ignore) { }
  return addr;
}
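The same local-address test can be reproduced with plain JDK APIs; a minimal sketch (the hostname is illustrative):

import java.net.InetAddress;
import java.net.NetworkInterface;

public class LocalAddressCheck {
    public static void main(String[] args) throws Exception {
        InetAddress addr = InetAddress.getByName("hadoop.hadoop.com");
        // Same criterion as NetUtils#getLocalInetAddress: the address counts as
        // local only if some network interface on this machine carries it.
        boolean isLocal = NetworkInterface.getByInetAddress(addr) != null;
        System.out.println(addr + " local=" + isLocal);
    }
}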
SecurityUtil#getByName decides how to resolve the hostname according to the setting hadoop.security.token.service.use_ip (HADOOP-7808).
/**
 * Resolves a host subject to the security requirements determined by
 * hadoop.security.token.service.use_ip. Optionally logs slow resolutions.
 *
 * @param hostname host or ip to resolve
 * @return a resolved host
 * @throws UnknownHostException if the host doesn't exist
 */
@InterfaceAudience.Private
public static InetAddress getByName(String hostname) throws UnknownHostException {
  if (logSlowLookups || LOG.isTraceEnabled()) {
    StopWatch lookupTimer = new StopWatch().start();
    InetAddress result = hostResolver.getByName(hostname);
    long elapsedMs = lookupTimer.stop().now(TimeUnit.MILLISECONDS);
    if (elapsedMs >= slowLookupThresholdMs) {
      LOG.warn("Slow name lookup for " + hostname + ". Took " + elapsedMs + " ms.");
    } else if (LOG.isTraceEnabled()) {
      LOG.trace("Name lookup for " + hostname + " took " + elapsedMs + " ms.");
    }
    return result;
  } else {
    return hostResolver.getByName(hostname);
  }
}
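Outside of Hadoop, the same slow-lookup warning can be approximated with a stopwatch around InetAddress.getByName; a minimal sketch (the hostname and the 1000 ms threshold are illustrative):

import java.net.InetAddress;
import java.util.concurrent.TimeUnit;

public class SlowLookupTimer {
    public static void main(String[] args) throws Exception {
        String hostname = "hadoop.hadoop.com";  // illustrative
        long slowLookupThresholdMs = 1000;      // illustrative threshold

        long start = System.nanoTime();
        InetAddress result = InetAddress.getByName(hostname);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        if (elapsedMs >= slowLookupThresholdMs) {
            System.err.println("Slow name lookup for " + hostname + ". Took " + elapsedMs + " ms.");
        }
        System.out.println(result + " resolved in " + elapsedMs + " ms");
    }
}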
There are two resolver classes, StandardHostResolver and QualifiedHostResolver. The former is the standard Java hostname resolution; the latter is hardened for both resolution performance and security.
// QualifiedHostResolver
/**
 * This an alternate resolver with important properties that the standard
 * java resolver lacks:
 * 1) The hostname is fully qualified. This avoids security issues if not
 *    all hosts in the cluster do not share the same search domains. It
 *    also prevents other hosts from performing unnecessary dns searches.
 *    In contrast, InetAddress simply returns the host as given.
 * 2) The InetAddress is instantiated with an exact host and IP to prevent
 *    further unnecessary lookups. InetAddress may perform an unnecessary
 *    reverse lookup for an IP.
 * 3) A call to getHostName() will always return the qualified hostname, or
 *    more importantly, the IP if instantiated with an IP. This avoids
 *    unnecessary dns timeouts if the host is not resolvable.
 * 4) Point 3 also ensures that if the host is re-resolved, ex. during a
 *    connection re-attempt, that a reverse lookup to host and forward
 *    lookup to IP is not performed since the reverse/forward mappings may
 *    not always return the same IP. If the client initiated a connection
 *    with an IP, then that IP is all that should ever be contacted.
 *
 * NOTE: this resolver is only used if:
 *       hadoop.security.token.service.use_ip=false
 */
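Points 2 and 3 rest on InetAddress.getByAddress(host, addr), which pairs a name with an address without performing any lookup at all; a minimal sketch (the name and the documentation IP 192.0.2.10 are illustrative):

import java.net.InetAddress;

public class ExactAddress {
    public static void main(String[] args) throws Exception {
        // Pair the literal IP 192.0.2.10 with a name without contacting DNS.
        InetAddress addr = InetAddress.getByAddress(
                "hadoop.hadoop.com", new byte[] {(byte) 192, 0, 2, 10});

        // getHostName() returns the name given above; no reverse lookup happens,
        // so an unresolvable host cannot cause a DNS timeout here.
        System.out.println(addr.getHostName() + " / " + addr.getHostAddress());
    }
}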
It normalizes the incoming name and then calls the appropriate resolution method. For example, hadoop.hadoop.com gets a trailing "." appended to make it an FQDN, which avoids the security issues caused by inconsistent search domains as well as unnecessary DNS lookups.
InetAddress getByExactName(String host) {
  InetAddress addr = null;
  // InetAddress will use the search list unless the host is rooted
  // with a trailing dot. The trailing dot will disable any use of the
  // search path in a lower level resolver. See RFC 1535.
  String fqHost = host;
  if (!fqHost.endsWith(".")) fqHost += ".";
  try {
    addr = getInetAddressByName(fqHost);
    // can't leave the hostname as rooted or other parts of the system
    // malfunction, ex. kerberos principals are lacking proper host
    // equivalence for rooted/non-rooted hostnames
    addr = InetAddress.getByAddress(host, addr.getAddress());
  } catch (UnknownHostException e) {
    // ignore, caller will throw if necessary
  }
  return addr;
}
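The effect of the trailing dot can be observed directly: a rooted name is taken literally, while an unqualified name may be expanded with each search domain from resolv.conf. A minimal sketch (both names are illustrative):

import java.net.InetAddress;
import java.net.UnknownHostException;

public class RootedLookup {
    public static void main(String[] args) {
        // "hadoop" is unqualified: the resolver may expand it with each search
        // domain from resolv.conf before giving up.
        // "hadoop.hadoop.com." is rooted: the trailing dot disables the search
        // list, so exactly one question is asked (RFC 1535).
        for (String name : new String[] {"hadoop", "hadoop.hadoop.com."}) {
            try {
                System.out.println(name + " -> " + InetAddress.getByName(name));
            } catch (UnknownHostException e) {
                System.out.println(name + " -> unknown host");
            }
        }
    }
}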
In the end the JDK native method Inet6AddressImpl#lookupAllHostAddr (or Inet4AddressImpl#lookupAllHostAddr) is called to resolve the hostname. According to the JDK source, this native method uses the standard getaddrinfo call.
memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_CANONNAME;
hints.ai_family = AF_UNSPEC;
error = getaddrinfo(hostname, NULL, &hints, &res);

last_i = gaih_inet (name, pservice, hints, end, &naddrs, &tmpbuf);
int err = __nscd_getai (name, &air, &h_errno);
fct4 = __nss_lookup_function (nip, "gethostbyname4_r");
    _nss_files_gethostbyname4_r
    _nss_dns_gethostbyname4_r
getaddrinfo is a standard glibc call. It combines the functionality of gethostbyname and getservbyname and hides the differences between IPv4 and IPv6.
Given node and service, which identify an Internet host and a service, getaddrinfo() returns one or more addrinfo structures, each of which contains an Internet address that can be specified in a call to bind(2) or connect(2). The getaddrinfo() function combines the functionality provided by the gethostbyname(3) and getservbyname(3) functions into a single interface, but unlike the latter functions, getaddrinfo() is reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies.
To see the resolution process in detail, the following sample program was used for testing:
import java.net.InetAddress;
import java.net.UnknownHostException;

public class Test {
    public static void main(String[] args) {
        try {
            InetAddress addr = InetAddress.getByName("hadoop.hadoop.com.");
            System.out.println(addr);
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
    }
}
After compiling it, trace its system calls with strace:
javac Test.java
strace -tt -T -f -i -s 4096 java Test >test.log 2>&1
According to the strace output (no entry in the hosts file, no record in DNS), the process first connects to nscd for a hosts lookup, which takes 5 seconds:
[pid 10987] 16:24:06 connect(7, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0 <0.000012>
[pid 10987] 16:24:06 sendto(7, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0", 18, MSG_NOSIGNAL, NULL, 0) = 18 <0.000010>
[pid 10987] 16:24:06 poll([{fd=7, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=7, revents=POLLIN|POLLHUP}]) <0.000009>
[pid 10987] 16:24:06 recvmsg(7, {msg_name(0)=NULL, msg_iov(2)=[{"hosts\0", 6}, {"\270O\3\0\0\0\0\0", 8}], msg_controllen=24, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {8}}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 14 <0.000009>
[pid 10987] 16:24:06 mmap(NULL, 217016, PROT_READ, MAP_SHARED, 8, 0) = 0x7f6e4199d000 <0.000010>
[pid 10987] 16:24:06 socket(PF_FILE, 0x80801 /* SOCK_??? */, 0) = 7 <0.000009>
[pid 10987] 16:24:06 connect(7, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0 <0.000011>
[pid 10987] 16:24:06 sendto(7, "\2\0\0\0\4\0\0\0\23\0\0\0hadoop.hadoop.com.\0", 31, MSG_NOSIGNAL, NULL, 0) = 31 <0.000010>
[pid 10987] 16:24:06 poll([{fd=7, events=POLLIN|POLLERR|POLLHUP}], 1, 5000 <unfinished ...>
[pid 10987] 16:24:11 <... poll resumed> ) = 0 (Timeout) <5.001059>
[pid 10987] 16:24:11 close(7) = 0 <0.000011>
It then reads nsswitch.conf, tries the hosts file first, and falls back to DNS after that fails:
[pid 28793] 09:54:28.118467 [ 7f225b2f4eb0] open("/etc/nsswitch.conf", O_RDONLY) = 7 <0.000012>
[pid 28793] 09:54:28.119931 [ 7f225b2f4ecd] open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 7 <0.000009>
It reads resolv.conf and queries the DNS server twice (another 10 seconds):
[pid 10987] 16:24:11 [ 7f225b2f4eb0] open("/etc/resolv.conf", O_RDONLY) = 7 <0.000008>
[pid 10987] 16:24:11 socket(PF_INET, 0x802 /* SOCK_??? */, IPPROTO_IP) = 7 <0.000011>
[pid 10987] 16:24:11 connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("170.4.0.101")}, 28) = 0 <0.000010>
[pid 10987] 16:24:11 poll([{fd=7, events=POLLOUT}], 1, 0) = 1 ([{fd=7, revents=POLLOUT}]) <0.000008>
[pid 10987] 16:24:11 sendto(7, "\231{\1\0\0\1\0\0\0\0\0\0\6hadoop\6hadoop\3com\0\0"..., 35, MSG_NOSIGNAL, NULL, 0) = 35 <0.000017>
[pid 10987] 16:24:11 poll([{fd=7, events=POLLIN}], 1, 5000 <unfinished ...>
[pid 10987] 16:24:16 <... poll resumed> ) = 0 (Timeout) <5.003036>
[pid 10987] 16:24:16 poll([{fd=7, events=POLLOUT}], 1, 0) = 1 ([{fd=7, revents=POLLOUT}]) <0.000009>
[pid 10987] 16:24:16 sendto(7, "\231{\1\0\0\1\0\0\0\0\0\0\6hadoop\6hadoop\3com\0\0"..., 35, MSG_NOSIGNAL, NULL, 0) = 35 <0.000016>
[pid 10987] 16:24:16 poll([{fd=7, events=POLLIN}], 1, 5000 <unfinished ...>
[pid 10987] 16:24:21 <... poll resumed> ) = 0 (Timeout) <5.003870>
[pid 10987] 16:24:21 close(7) = 0 <0.000015>
[pid 10987] 16:24:21 lseek(3, 53540581, SEEK_SET) = 53540581 <0.000010>
The overall resolution flow is:
1) Connect to nscd and try to resolve the name from its cache.
2) Resolve locally, trying the sources in the order configured in nsswitch.conf; the relevant settings are shown below.
# nsswitch.conf
hosts: files dns              # hosts file first, then DNS

# resolv.conf
options timeout:5 attempts:2  # 5-second timeout per query, two attempts
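These two settings also explain the stalls in Problem 1: a name that cannot be resolved blocks for roughly timeout × attempts × nameservers seconds of DNS waiting, on top of the nscd timeout seen in the strace output. A small sketch of that arithmetic (the values mirror the configuration above):

public class WorstCaseDelay {
    public static void main(String[] args) {
        int timeoutSeconds = 5;   // resolv.conf "options timeout:5"
        int attempts = 2;         // resolv.conf "options attempts:2"
        int nameservers = 1;      // one nameserver entry in resolv.conf
        int nscdWaitSeconds = 5;  // poll timeout observed in the strace output above

        int dnsSeconds = timeoutSeconds * attempts * nameservers;
        System.out.println("DNS wait: " + dnsSeconds + " s, with nscd: "
                + (dnsSeconds + nscdWaitSeconds) + " s");
        // With these defaults a single failed lookup can stall connection
        // setup for roughly 10-15 s, matching the jstack in Problem 1.
    }
}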
Note also that on some older operating systems the nscd process does not notice changes to nsswitch.conf; after modifying the file, nscd must be restarted manually.
# man nscd
NOTES
    Nscd doesn't know anything about the underlying protocols for a service. This also means,
    that if you change /etc/resolv.conf for DNS queries, nscd will continue to use the old one
    if you have configured /etc/nsswitch.conf to use DNS for host lookups. In such a case, you
    need to restart nscd.
Summary
Resolution flow: nscd --> hosts file --> DNS (two attempts by default); inside nscd the order is again hosts --> DNS (one attempt).
Prefer /etc/hosts. It is less convenient to manage, but it is stable and fast. Put hostname-to-IP mappings into the hosts file wherever possible; this also keeps distributed resolution requests from placing unnecessary load on the DNS service.
Tune the configuration. The default timeout and retry count in resolv.conf can be adjusted to limit the performance impact when a name cannot be resolved.