IDS Forum
Re: How to Monitor Open Connection Tiime???
Posted By: Fernando Nunes Date: Friday, 3 December 2010, at 12:48 p.m.
In Response To: Re: How to Monitor Open Connection Tiime??? (Cesar Inacio Martins)
My glibc is:
[root@pacman ~]# rpm -qa | grep glibc
glibc-2.12-3.i686
You can prove/test the behavior with the simple C program I sent you. You
should see different traces depending on the OS.
Assuming your client can read C (the example is simple), you can
demonstrate it on their system as well.
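
A minimal sketch of such a test program could look like the code below. It is
not the program actually sent in this thread. The function names
reverse_lookup and lookup_loop are hypothetical; the only real API calls
assumed are the standard inet_pton() and gethostbyaddr().

```c
/* Sketch of a reverse-DNS test helper; names are illustrative only. */
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Reverse-resolve a dotted-quad IPv4 address with gethostbyaddr().
 * Returns  0 and copies the host name into out on success,
 *         -1 if the arguments are bad or ip is not a valid IPv4 address,
 *          1 if the lookup itself failed (no PTR record, timeout, ...).
 */
int reverse_lookup(const char *ip, char *out, size_t outlen)
{
    struct in_addr addr;
    struct hostent *he;

    if (ip == NULL || out == NULL || outlen == 0)
        return -1;
    if (inet_pton(AF_INET, ip, &addr) != 1)
        return -1;

    he = gethostbyaddr(&addr, sizeof(addr), AF_INET);
    if (he == NULL || he->h_name == NULL)
        return 1;

    snprintf(out, outlen, "%s", he->h_name);
    return 0;
}

/*
 * Read one IP per line from in and try to resolve each of them.
 * The process stays alive between lookups, so /etc/resolv.conf can be
 * edited between two inputs while the program is being traced.
 */
void lookup_loop(FILE *in)
{
    char line[64];
    char name[1025];

    while (fgets(line, sizeof(line), in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the newline */
        if (line[0] == '\0')
            continue;

        switch (reverse_lookup(line, name, sizeof(name))) {
        case 0:
            printf("%s -> %s\n", line, name);
            break;
        case -1:
            printf("%s -> not an IPv4 address\n", line);
            break;
        default:
            printf("%s -> lookup failed (h_errno=%d)\n", line, h_errno);
            break;
        }
        fflush(stdout);
    }
}
```

Call lookup_loop(stdin) from main(), run the binary under strace, and edit
/etc/resolv.conf between two inputs. Whether a new open of /etc/resolv.conf
shows up in the trace should tell you how that particular glibc behaves.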
Anyway, it's now clear that it will not work until you restart. If you don't
want to (or can't) restart, your only option is to "hack" the TCP stack somehow
and make the old DNS server IP available.
You don't need to have a DNS server there... If it sends the packets and
there is nothing on port 53 UDP it will give up quickly...
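
The likely reason for the quick give-up is that a live host with nothing on
the port answers with an ICMP "port unreachable", which Linux reports as
ECONNREFUSED on a connected UDP socket (the traces in this thread show the
resolver calling connect() on its UDP socket). A dead IP sends nothing back,
so the resolver waits for the full timeout. A rough sketch to check this on a
given machine follows; probe_udp is a hypothetical helper, not part of any
library.

```c
/* Sketch: distinguish "port refused" from "no answer" on a UDP port. */
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send one dummy datagram to ip:port and wait up to timeout_ms.
 * Returns  1 if the kernel reported ECONNREFUSED (ICMP port unreachable),
 *          2 if some datagram came back,
 *          0 if nothing happened before the timeout,
 *         -1 on bad arguments or a local socket error.
 */
int probe_udp(const char *ip, int port, int timeout_ms)
{
    struct sockaddr_in sa;
    struct pollfd pfd;
    char buf[512];
    int fd, rc, result = -1;

    if (ip == NULL || port < 1 || port > 65535)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() on UDP only fixes the peer address; it is what allows
       ICMP errors to be delivered to this socket. */
    if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        if (send(fd, "x", 1, 0) == 1) {
            pfd.fd = fd;
            pfd.events = POLLIN;
            pfd.revents = 0;
            rc = poll(&pfd, 1, timeout_ms);
            if (rc == 0) {
                result = 0;                 /* silence: full timeout */
            } else if (rc > 0) {
                if (recv(fd, buf, sizeof(buf), 0) >= 0)
                    result = 2;             /* something answered */
                else if (errno == ECONNREFUSED)
                    result = 1;             /* port unreachable */
            }
        } else if (errno == ECONNREFUSED) {
            result = 1;
        }
    }

    close(fd);
    return result;
}
```

For example, probe_udp("172.18.0.57", 53, 5000) should return 0 after about
5 seconds while that IP is unreachable. Once the address is alive with
nothing listening on port 53, it would be expected to return 1 almost
immediately, unless a firewall drops the ICMP reply.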
Alternatively you could open a case with RH... Maybe they have a way to deal
with that... If not, they may be interested in fixing it... You have a test
case, and my past experiences tell me that it's half way to a fix... (or
more...)
Regards.
On Fri, Dec 3, 2010 at 5:33 PM, Cesar Inacio Martins <
cesar_inacio_martins@yahoo.com.br> wrote:
> Oops, sorry. I confused myself: I meant resolv.conf but wrote
> hosts.equiv...
>
> The correct statement is: "- Each open connection always rereads
> resolv.conf...".
>
> So I ran some tests here (on my netbook), now using the same version as our
> production (11.50 xC7W1GE).
> And it works too!
>
> I started the database under strace and kept a tail on the trace files.
> Between these two blocks I changed the DNS server in /etc/resolv.conf from
> 8.8.8.8 to 1.1.1.1 and tried to open a new connection from a different host.
>
> (check lines marked with "###" by me)
> -----------------------------------
> $tail -n +0 -f ifx* | egrep "add|nscd|host|resol"
>
> ###14:01:48 accept(7, {sa_family=AF_INET6, sin6_port=htons(29315),
> inet_pton(AF_INET6, "::ffff:172.18.0.104",&sin6_addr), sin6_flowinfo=0,
> sin6_scope_id=0}, [28]) = 4
> ###14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
> ....}) = 0
> ###14:01:48 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
> ###14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
> ....}) = 0
> 14:01:48 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
> 14:01:48 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
> 14:01:48 read(3, "#\n# hosts This file desc"..., 4096) = 3957
> 14:01:48 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=850,
> ...})
> = 0
> ###14:01:48 connect(3, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("8.8.8.8")}, 28) = 0
> 14:01:48 send(3,
> "x8\1\0\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"...,
> 43, MSG_NOSIGNAL) = 43
> 14:01:48 recvfrom(3,
> "x8\201\203\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"..., 1024, 0,
> {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")},
> [16])
> = 43
>
> ###14:02:09 accept(7, {sa_family=AF_INET6, sin6_port=htons(46297),
> inet_pton(AF_INET6, "::ffff:172.18.0.105",&sin6_addr), sin6_flowinfo=0,
> sin6_scope_id=0}, [28]) = 4
> 14:02:09 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
> 14:02:09 read(3, "#\n# hosts This file desc"..., 4096) = 3957
> ###14:02:09 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=871,
> ....}) = 0
> ###14:02:09 open("/etc/resolv.conf", O_RDONLY) = 3
> ###14:02:09 read(3, "### /etc/resolv.conf file autoge"..., 4096) = 871
> ###14:02:09 connect(3, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 28) = 0
> 14:02:09 send(3,
> "\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
> MSG_NOSIGNAL) = 43
> 14:02:14 send(3,
> "\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
> MSG_NOSIGNAL) = 43
> 14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
> ...})
> = 0
> 14:02:19 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
> 14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
> ...})
> = 0
> 14:02:19 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
> -----------------------------------
>
> And this behavior (rereading resolv.conf) occurs only the first time a given
> IP makes a connection. If it closes and reopens, the server doesn't try to
> resolve the DNS again... only a few minutes later (probably some kind of
> cache timeout).
>
> Adding these IPs to /etc/hosts was the first thing our Linux admin did,
> but there are a lot of them and they can change dynamically... so it isn't
> an option.
>
> Now I'm starting to believe in Informix's innocence :) and to blame the
> O.S., but I need some way to prove this.
>
> After some research on the functions gethostbyaddr / gethostbyip, I found
> that they are part of the "resolver" lib, which comes from GLIBC.
>
> On my OpenSuse 11.2 (recently updated), the Glibc is 2.10 (kernel 2.6.31).
> In the production environment (updated Aug/2010), Red Hat 5.5, the Glibc is
> 2.5.
>
> Fernando, which version is your glibc?
>
> I'm looking through the changelogs / patches for these functions to identify
> whether they are the problem and which version of glibc already solves it.
> Looking at the changelog of my glibc (rpm -q --changelog glibc), there are a
> few changes around resolv.conf... but nothing in particular for this
> situation.
>
> Regards
> Cesar
>
> On 12/03/2010 01:47 PM, Fernando Nunes wrote:
> > On Fri, Dec 3, 2010 at 12:44 PM, Cesar Inacio Martins<
> > cesar_inacio_martins@yahoo.com.br> wrote:
> >
> >> Hi Fernando,
> >>
> >> I wrote a little program here and confirmed what you said about not
> >> "seeing" the call to gethostbyaddr...
> >>
> >> But... I got new variables.
> >> Testing on my netbook, IFX 11.70 xC1 (not the same version used in
> >> our production) + OpenSuse 11.2, I opened connections while changing
> >> resolv.conf, tracing with strace:
> >> - Each open connection always rereads hosts.equiv. (I believed this was
> >> the res_init() function working inside gethostbyaddr.)
> >>
> > hosts.equiv has nothing to do with DNS. It's for trusted connections, and
> > yes, it's read every time (or maybe when it changes, depending on the OS).
> >
> >> - When I change resolv.conf, without bouncing the instance, the next
> >> connection already tries to resolve DNS using the new configuration.
> >>
> > Not on my system (with the test program). I was using Fedora 13.
> >
> >> So now the question: is it an Informix or a Linux problem?
> >> I will install on my netbook the same version where we have the problem,
> >> ifx 11.50 uc7w1ge (but 32 bits...), and test.
> >>
> > On my system, I tested without Informix, so it was definitely a problem
> > with the gethostbyaddr() function.
> > Note that calling this a "problem" is a bit simplistic... I'm not sure we
> > would want it to re-read the file on each request...
> >
> >> Just for curiosity, read this thread :
> >> http://fixunix.com/redhat/17199-etc-resolv-conf-how-reload.html
> >>
> >>
> > Too long! :) Sorry. Only later.
> >
> >> It's a similar situation, not with Informix and on RH 4 (not 5.5).
> >>
> >> Now my suspicion is on the glibc used on this RH.
> >> When I finish the test on my netbook with the same version we use
> >> here in production, I will post the results here.
> >>
> >>
> > What about the possibility of adding the hosts to your /etc/hosts file?
> > Are there many connection points in this situation?
> >
> >> On 12/01/2010 11:07 PM, Fernando Nunes wrote:
> >>> On Wed, Dec 1, 2010 at 6:01 PM, Cesar Inacio Martins<
> >>> cesar_inacio_martins@yahoo.com.br> wrote:
> >>>
> >>>> Hi Fernando!
> >>>> Thanks for your message!
> >>>> Sorry for taking so long to answer; I don't know why the last 2 days of
> >>>> messages coming from IIUG arrived just now.
> >>>>
> >>>> So, about gethostbyaddr(): I'm not sure about that, because it isn't
> >>>> what we see in the strace.
> >>>>
> >>> I wrote a simple program in Fedora to test your situation... Several
> >>> interesting things came up. They're not very helpful, but they prove
> >>> Informix's innocence :)
> >>>
> >>> In strace you will not see any reference to gethostbyaddr. Only in a
> >>> debugger, or if you force a stack trace at the precise moment...
> >>>
> >>>> The DNS reverse lookup request appears to be executed "manually" by
> >>>> Informix, or else strace simply traced the gethostbyaddr() call too.
> >>>> Check the output marked with "##" by me,
> >>>> where 172.18.0.57 and 172.18.0.119 are the OLD DNS servers from when the
> >>>> server was started. When we ran this strace, resolv.conf had already
> >>>> held the new values for at least 2 days:
> >>>>
> >>>> 11:24:28 munmap(0x2abeb6d75000, 4096) = 0
> >>>> 11:24:28 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
> >>>> ##11:24:28 connect(3, {sa_family=AF_INET, sin_port=htons(53),
> >>>> sin_addr=inet_addr("172.18.0.57")}, 28) = 0
> >>>> 11:24:28 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> >>>> 11:24:28 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> >>>> 11:24:28 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3,
> >>>> revents=POLLOUT}])
> >>>> ##11:24:28 sendto(3,
> >>>> "\242;\1\0\0\1\0\0\0\0\0\0\003185\00248\00222\003172\7in-ad"..., 44,
> >>>> MSG_NOSIGNAL, NULL, 0) = 44
> >>>> ##11:24:28 poll([{fd=3, events=POLLIN}], 1, 5000) = 0 (Timeout)
> >>>> 11:24:33 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
> >>>> 11:24:33 connect(4, {sa_family=AF_INET, sin_port=htons(53),
> >>>> sin_addr=inet_addr("172.18.0.119")}, 28) = 0
> >>>> 11:24:33 fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
> >>>> .... it continues trying, up to the third DNS server...
> >>>> (check the timestamps: 5 seconds of timeout delay)
> >>>>
> >>>> Before you ask, the Linux admin already tried changing the timeout
> >>>> (option in resolv.conf) and it had no effect...
> >>>>
> >>>>
> >>> 2nd interesting observation... I put the program in a loop, reading an
> >>> IP from the console and trying to reverse DNS it.
> >>> Between loops I changed my resolv.conf... It didn't matter for the
> >>> running process. I believe gethostbyaddr creates some static structures.
> >>> The first time it runs it reads resolv.conf, but it doesn't happen
> >>> again... Probably for performance reasons. But it's inconvenient in your
> >>> situation:
> >>>
> >>> open("/etc/resolv.conf", O_RDONLY) = 3
> >>> fstat64(3, {st_mode=S_IFREG|0644, st_size=55, ...}) = 0
> >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb776e000
> >>> read(3, "# Generated by NetworkManager\nna"..., 4096) = 55
> >>> read(3, "", 4096) = 0
> >>> close(3)
> >>>
> >>> This is why it keeps trying the old DNS...
> >>>
> >>>> Sorry, yes, it is 11.50 FC7 GE on Linux Red Hat 5.5.
> >>>> The option of creating a group is nice, but I'm not sure it is viable
> >>>> for this environment, because there are a lot of applications spread
> >>>> across the company (.net/windows, java/web, C/Linux) for which we would
> >>>> need to change the SQLHOSTS...
> >>>>
> >>>> The nsswitch.conf is OK; it hasn't been modified from its default
> >>>> values. The problem is with connections that come from outside our
> >>>> network (which don't exist in /etc/hosts) and create consequences for
> >>>> local connections (which exist in hosts) when the instance has trouble
> >>>> resolving the hostnames of the non-local clients.
> >>>>
> >>> Are there many client IPs from outside the network? If the number is
> >>> low you could put them in /etc/hosts...
> >>>
> >>>> And the problem we detected is in the MSC, because that is what
> >>>> freezes...
> >>>> The weirdest thing is that when the MSC "freezes" (stuck running in the
> >>>> active threads: onstat -g act), its stack dump (onstat -g stk #threadid)
> >>>> doesn't change at all; it still shows the yield processing, and only the
> >>>> status changes from sleep to running...
> >>>>
> >>>>
> >>> Now... none of this is good news. I really can't see a good solution
> >>> besides a restart... You're a victim of how TCP/IP name resolution
> >>> works. Apparently nscd will not work either, because gethostbyaddr only
> >>> tries it the first time...
> >>>
> >>> Some other thoughts:
> >>>
> >>> 1- You could launch more MSC VPs. The new ones should pick up the
> >>> correct DNS. And before you ask, no, you cannot remove MSC VPs...
> >>> But with more MSC VPs your chances of getting stuck should be lower.
> >>> 2- (this is very weird...) You could use iptables to hijack connections
> >>> to the OLD DNS and redirect them to the new one... Not sure if this is
> >>> possible, but you would have to create a rule for the old IP, port 53
> >>> and protocol UDP.
> >>> 3- You could "hack" the network to make the old IP available. This
> >>> could be done by putting a machine on the network with that IP, or
> >>> possibly by hacking the ARP table.
> >>> 4- You could create another TCP interface on the machine (how?) with
> >>> the IP of the old DNS server. I tried putting my own IP in the
> >>> /etc/resolv.conf file and the reverse DNS became "quick" again (although
> >>> I don't have anything listening on port 53)...
> >>> The problem you're facing comes from the fact that the IP cannot be
> >>> reached. If you make it go to an existing IP it will not resolve, but it
> >>> will be quick.
> >>>
> >>> Naturally 2, 3 and 4 are "crazy" suggestions... But I don't see any
> >>> "sane" alternative to an instance stop.
> >>>
> >>> Regards.
> >>>
> >>
> >>
> >>
> >
>
> *******************************************************************************
> >> Forum Note: Use "Reply" to post a response in the discussion forum.
> >>
> >>
--
Fernando Nunes
Portugal
http://informix-technology.blogspot.com
My email works... but I don't check it frequently...