Discussion:
Strange behaviour with ptmx file descriptors
(too old to reply)
Ángel González
2016-04-15 00:06:53 UTC
Permalink
Hello Nacho
Hi list
(…)
Transmission KO
read(13, "\7\33[1;16H \33[6;6H_______\33[7;1H -INFORME CANT.
RECOGIDA-\33[7;26H", 16384) = 67
write(3,
"P\247\244}\277\322\260\21\3314\7\227\223~\317\360\35\334\232\372\237\250\320\312\1;\25\37\23\363\363O&0\355i{zUbr\365,\362yyl\222",
48) = 48
It seems the file descriptor disappears while the system call (read) is
reading from it?
No, the file descriptor will still be there.

sshd is taking the program output from the terminal (13) and sending out
through the socket
(3). What's the corresponding strace of the client ssh?
I ran out of ideas. Can anybody drive me to the right direction?
Regards, Nacho.
What's the terminal emulator used in the client side? Does resetting it
(from a menu option) make it receive the server output again?


Cheers
Nacho del Rey
2016-04-15 09:28:16 UTC
Permalink
Hi Angel and many thanks for your answer

The application still sending & receiving data
.- strace over the application:

write(1, "\33[1;1H\237#SF \234", 44) = 44
<--it was sent from the application to the terminal, but ssh didn't
received this string
read(0, "\10", 1024) = 1 <- the client remained sending data and
it was received by the application
read(0, "\n", 1024) = 1
write(1, "\33[6;6H", 6) = 6 <-- the answered '\n' but it wasn't
received by the ssh

finally, several minutes later, the operator realized the screen is frozen
and he closed the ssh client

The strange thing is all file descriptors of all processes are in place but
it seems for any strange reason, something happen to fd#13 ( C Program fd
#1 --> ssd read fd #13) because it disappears from the select call (it
won't be available in the read fd set ever)

ssh client is made by Honeywell company.

This behaviour is similar as when you send a Ctrl+S from the ssh client but
in this case, fd#13 is not lost.

Kind regards

Nacho.
Post by Ángel González
Hello Nacho
Hi list
(…)
Transmission KO
read(13, "\7\33[1;16H \33[6;6H_______\33[7;1H -INFORME CANT.
RECOGIDA-\33[7;26H", 16384) = 67
write(3,
"P\247\244}\277\322\260\21\3314\7\227\223~\317\360\35\334\232\372\237\250\320\312\1;\25\37\23\363\363O&0\355i{zUbr\365,\362yyl\222",
48) = 48
It seems the file descriptor disappears while the system call (read) is
reading from it?
No, the file descriptor will still be there.
sshd is taking the program output from the terminal (13) and sending out
through the socket
(3). What's the corresponding strace of the client ssh?
I ran out of ideas. Can anybody drive me to the right direction?
Regards, Nacho.
What's the terminal emulator used in the client side? Does resetting it
(from a menu option) make it receive the server output again?
Cheers
Ángel González
2016-04-15 21:53:53 UTC
Permalink
Post by Nacho del Rey
Hi Angel and many thanks for your answer
The application still sending & receiving data
I asked for the corresponding strace of the client ssh, where the data
should have been received and printed to the console.
Post by Nacho del Rey
The strange thing is all file descriptors of all processes are in
place but it seems for any strange reason, something happen to fd#13 (
C Program fd #1 --> ssd read fd #13) because it disappears from the
select call (it won't be available in the read fd set ever)
Interesting. I wonder if that's expected and sshd waits for
acknowledgment from the remote ssh before reading again.
Post by Nacho del Rey
What's the terminal emulator used in the client side? Does
resetting it (from a menu option) make it receive the server
output again?
Cheers
Best regards
Nacho del Rey
2016-04-16 09:11:32 UTC
Permalink
Hi again

I think I didn't explain the problem properly

I'll try to explain it using an example of what it happened last time in
Feb, the 23th

This is the point of view of the C application, running in the linux box,
when answering to the client:

4911 1456244042.324336 write(1, "\7", 1) = 1
4911 1456244042.324610 write(1, "\33[1;16H \33[6;6H_______\33[7;1H
-INFORME CANT. RECOGIDA-\33[7;26H", 66) = 66

This is the point of view of the ssh process in the linux box:

6647 1456244042.323463 select(14, [3 9 13], [], NULL, {900, 0}) = 1 (in
[13], left {899, 998569})
6647 1456244042.324971 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
6647 1456244042.325010 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
6647 1456244042.325033 read(13, "\7\33[1;16H
\33[6;6H_______\33[7;1H -INFORME CANT. RECOGIDA-\33[7;26H", 16384) = 67
6647 1456244042.325126 select(14, [3 9], [3], NULL, {900, 0}) = 1 (out
[3], left {899, 999998})
6647 1456244042.325203 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
6647 1456244042.325240 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
6647 1456244042.325264 write(3,
"P\247\244}\277\322\260\21\3314\7\227\223~\317\360\35\334\232\372\237\250\320\312\1;\25\37\23\363\363O&0\355i{zUbr\365,\362yyl\222",
48) = 48
6647 1456244042.325310 select(14, [3 9], [], NULL, {900, 0}) = 1 (in [3],
left {892, 12016}) <<-- where id fd#13 ?


This is the point of view of the ssh client:

74539.285 0BE6003A I-ITE 38:VT220_IO:
tohost:0x31='1'

74539.300 0BE6003A I-ITE 52:ANSITERM:
charin:<ASCII(27)>

74539.300 0BE6003A I-ITE 53:ANSITERM:
charin:<[>

74539.300 0BE6003A I-ITE 54:ANSITERM:
charin:<1>

74539.300 0BE6003A I-ITE 55:ANSITERM:
charin:<;>

74539.300 0BE6003A I-ITE 56:ANSITERM:
charin:<2>

74539.300 0BE6003A I-ITE 57:ANSITERM:
charin:<3>

74539.300 0BE6003A I-ITE 58:ANSITERM:
charin:<H>

74539.301 0BE6003A I-ITE 59:ANSITERM:
charin:<1>

74539.301 0BE6003A I-ITE 63:ANSITERM:
charin:<ASCII(27)>

74539.301 0BE6003A I-ITE 64:ANSITERM:
charin:<[>

74539.301 0BE6003A I-ITE 65:ANSITERM:
charin:<1>

74539.301 0BE6003A I-ITE 66:ANSITERM:
charin:<;>

74539.302 0BE6003A I-ITE 67:ANSITERM:
charin:<2>

74539.302 0BE6003A I-ITE 68:ANSITERM:
charin:<4>

74539.302 0BE6003A I-ITE 69:ANSITERM:
charin:<H>

74539.302 0BE6003A I-ITE 70:ANSITERM:
charin:<ASCII(7)>

74539.346 0BE6003A I-ITE 71:ANSITERM:
charin:<ASCII(27)>

74539.347 0BE6003A I-ITE 72:ANSITERM:
charin:<[>

74539.347 0BE6003A I-ITE 73:ANSITERM:
charin:<1>

74539.347 0BE6003A I-ITE 74:ANSITERM:
charin:<;>

74539.347 0BE6003A I-ITE 75:ANSITERM:
charin:<1>

74539.347 0BE6003A I-ITE 76:ANSITERM:
charin:<6>

74539.347 0BE6003A I-ITE 77:ANSITERM:
charin:<H>

74539.347 0BE6003A I-ITE 78:ANSITERM:
charin:< >

74539.348 0BE6003A I-ITE 82:ANSITERM:
charin:< >

74539.348 0BE6003A I-ITE 86:ANSITERM:
charin:< >

74547.277 0BE6003A I-ITE 97:VT220_IO:
tohost:0x08='.' <-- here the client stops receiving data

74547.779 0BE6003A I-ITE 08:VT220_IO:
tohost:0x08='.'

74547.821 0BE6003A I-ITE 19:VT220_IO:
tohost:0x08='.'

74547.864 0BE6003A I-ITE 30:VT220_IO:
tohost:0x08='.'

74547.907 0BE6003A I-ITE 41:VT220_IO:
tohost:0x08='.'

74547.950 0BE6003A I-ITE 52:VT220_IO:
tohost:0x08='.'

74547.993 0BE6003A I-ITE 63:VT220_IO:
tohost:0x08='.'

74548.036 0BE6003A I-ITE 74:VT220_IO:
tohost:0x08='.'

And when it happens, the ssh client still working properly, it continues
sending keystrokes to the the application, the application receives these
keystrokes, it answers to the client, but the answer is not yet received by
the sshd process (so it seems the application is frozen... when it is not
¿true?)

One time, the operator tried to reset the connection from the ssh client,
but no way

The C program works properly thru telnet.. no incident has been reported
ever

I hope this information helps

Many thanks

Regards

Nacho.
Post by Nacho del Rey
Hi Angel and many thanks for your answer
The application still sending & receiving data
I asked for the corresponding strace of the client ssh, where the data
should have been received and printed to the console.
The strange thing is all file descriptors of all processes are in place
but it seems for any strange reason, something happen to fd#13 ( C Program
fd #1 --> ssd read fd #13) because it disappears from the select call (it
won't be available in the read fd set ever)
Interesting. I wonder if that's expected and sshd waits for acknowledgment
from the remote ssh before reading again.
What's the terminal emulator used in the client side? Does resetting it
Post by Ángel González
(from a menu option) make it receive the server output again?
Cheers
Best regards
Peter Stuge
2016-04-16 12:14:05 UTC
Permalink
Post by Nacho del Rey
6647 1456244042.325310 select(14, [3 9], [], NULL, {900, 0}) = 1 (in [3],
left {892, 12016}) <<-- where id fd#13 ?
Can you map this back to the OpenSSH source code?


//Peter
Nacho del Rey
2016-04-20 14:52:45 UTC
Permalink
Hi Peter

Can you let me know how to proceed for forward this to the openssh source
code team?

Regards
Post by Peter Stuge
Post by Nacho del Rey
6647 1456244042.325310 select(14, [3 9], [], NULL, {900, 0}) = 1 (in
[3],
Post by Peter Stuge
Post by Nacho del Rey
left {892, 12016}) <<-- where id fd#13 ?
Can you map this back to the OpenSSH source code?
//Peter
Peter Stuge
2016-04-20 21:08:31 UTC
Permalink
Post by Nacho del Rey
Can you let me know how to proceed for forward this to the openssh
source code team?
Map your strace syscalls to the source code and point out where the
problem is. Do debugging on your own. Send a patch.


//Peter
Nacho del Rey
2016-06-18 18:50:26 UTC
Permalink
Hi again

Finally I recompiled the ssh to include several 'traps' within the code
(channels.c mainly)

channel_pre_open(Channel *c, fd_set *readset, fd_set *writeset)
{
u_int limit = compat20 ? c->remote_window : packet_get_maxsize();

int aux = buffer_len(&c->input);
debugnacho("En pre_open c-istate: %d limit %d buffer_len %d c_ostate %d
ctl_fd %d\n",c->istate,limit,aux,c->ostate,c->ctl_fd);

/* the rest of the function code */
}

In normal conditions the buffer_len(&c->input) is adjusting its window size

En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048495 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048495 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048415 buffer_len 0 c_ostate 0 ctl_fd -1
...... time later
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 998560 buffer_len 0 c_ostate 0 ctl_fd
-1
En pre_open c-istate: 0 limit 998560 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1

But, If I compare the same trace when the the hand-terminal connects to the
application...

En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048476 buffer_len 0 c_ostate 0 ctl_fd -1
......
En pre_open c-istate: 0 limit 985 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 632 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 322 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 322 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1

It seems the channel_input_window_adjust function is not adjusting rwindow
size and it finally comes to zero and it is when the problem appears ( the
file-descriptor fd#13 which connects to the application dissapear)

So the question is.. who is the responsible in adjusting
"channel_input_window_adjust" function? the binary application which runs
in the server or on the contrary the ssh client?

I ask this question because when I'm connected thru ssh (putty), although I
am not sending a keystroke to the server if I'm changing the size of the
putty window (using the mouse), I can see the buffer_len(c->input) changes
, and I don't have clear in which side (client/server) the rwindow is
adjusted

Thanks in advance

Nacho.
Post by Nacho del Rey
Hi Peter
Can you let me know how to proceed for forward this to the openssh source
code team?
Regards
Post by Peter Stuge
Post by Nacho del Rey
6647 1456244042.325310 select(14, [3 9], [], NULL, {900, 0}) = 1 (in
[3],
Post by Peter Stuge
Post by Nacho del Rey
left {892, 12016}) <<-- where id fd#13 ?
Can you map this back to the OpenSSH source code?
//Peter
Nacho del Rey
2016-06-22 10:18:56 UTC
Permalink
Hi again


Finally I found out what it was going on

The problem lays on the Honeywell ssh client (a fork of OpenSSH)
The client doesn't negotiate
SSH_MSG_CHANNEL_WINDOW_ADJUST/SSH2_MSG_CHANNEL_WINDOW_ADJUST parameters

This issue it's not happening using putty ssh client (for instance)

Hope this information helps anyone

[SOLVED]

Nacho.
Post by Nacho del Rey
Hi again
Finally I recompiled the ssh to include several 'traps' within the code
(channels.c mainly)
channel_pre_open(Channel *c, fd_set *readset, fd_set *writeset)
{
u_int limit = compat20 ? c->remote_window : packet_get_maxsize();
int aux = buffer_len(&c->input);
debugnacho("En pre_open c-istate: %d limit %d buffer_len %d c_ostate %d
ctl_fd %d\n",c->istate,limit,aux,c->ostate,c->ctl_fd);
/* the rest of the function code */
}
In normal conditions the buffer_len(&c->input) is adjusting its window
size
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048495 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048495 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048415 buffer_len 0 c_ostate 0 ctl_fd -1
...... time later
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1002267 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 998560 buffer_len 0 c_ostate 0 ctl_fd
-1
En pre_open c-istate: 0 limit 998560 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
But, If I compare the same trace when the the hand-terminal connects to
the application...
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048576 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 1048476 buffer_len 0 c_ostate 0 ctl_fd -1
......
En pre_open c-istate: 0 limit 985 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 647 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 632 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 322 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 322 buffer_len 0 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
En pre_open c-istate: 0 limit 0 buffer_len 16 c_ostate 0 ctl_fd -1
It seems the channel_input_window_adjust function is not adjusting rwindow
size and it finally comes to zero and it is when the problem appears ( the
file-descriptor fd#13 which connects to the application dissapear)
So the question is.. who is the responsible in adjusting
"channel_input_window_adjust" function? the binary application which runs
in the server or on the contrary the ssh client?
I ask this question because when I'm connected thru ssh (putty), although
I am not sending a keystroke to the server if I'm changing the size of the
putty window (using the mouse), I can see the buffer_len(c->input) changes
, and I don't have clear in which side (client/server) the rwindow is
adjusted
Thanks in advance
Nacho.
Post by Nacho del Rey
Hi Peter
Can you let me know how to proceed for forward this to the openssh source
code team?
Regards
Post by Peter Stuge
Post by Nacho del Rey
6647 1456244042.325310 select(14, [3 9], [], NULL, {900, 0}) = 1 (in
[3],
Post by Peter Stuge
Post by Nacho del Rey
left {892, 12016}) <<-- where id fd#13 ?
Can you map this back to the OpenSSH source code?
//Peter
Loading...