[ale] Python (or other) socket identification

Fri Feb 3 03:37:15 EST 2017

I've been trying to figure out a method of tracking some sockets but I
keep running into roadblocks.

I've got a small Python daemon script that runs a thread to accept
connections from remote devices.  All the basic stuff is working, so no
problems there.  What I'm trying to do is add an asynchronous queue to
the thread so I can send data back to those devices.

The idea was to put the device IP address and the data to send
into the queue (all IPs are static and known).  The daemon thread loops
continuously, looking for status updates from the remote devices (via
select()).  At the end of each loop, it would check the queue for new
items and send the data to the appropriate device.

I currently keep a list of the socket objects that get created for every
new connection.  This list gets fed into select() during each loop to
tell me where there's data waiting to be received from some device.  The
devices don't send updates rapidly.  Instead they maintain an open
connection and send an update once every couple of minutes, unless some
event occurs, in which case they send an immediate update.

Problem: I don't want to wait for one of the devices to send its message
so I can send the data back.  I want to send as soon as the data is in
the queue.

The first, brute-force solution is to loop through all the available
sockets, check whether the remote IP matches the one in the queue, then
send data to it.  It works but can get bogged down, as it does a
getpeername() (to get the remote IP) on each socket until it finds a match.
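
A minimal sketch of that brute-force scan, for reference.  The function
name find_socket_by_ip is made up for illustration, and the loopback
connection is only there so the example runs on its own:

```python
import socket

def find_socket_by_ip(sockets, wanted_ip):
    """Linear scan: return the first connected socket whose peer IP matches."""
    for s in sockets:
        try:
            peer_ip = s.getpeername()[0]
        except OSError:
            continue  # listening, closed, or unconnected socket: no peer to compare
        if peer_ip == wanted_ip:
            return s
    return None

# Loopback demo: one listening socket and one accepted connection.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
device = socket.create_connection(server.getsockname())
conn, address = server.accept()

found = find_socket_by_ip([server, conn], "127.0.0.1")
missing = find_socket_by_ip([server, conn], "192.0.2.99")  # hypothetical IP, not connected

for s in (device, conn, server):
    s.close()
```

Each lookup costs one getpeername() call per socket in the worst case,
which is the overhead described above.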

I didn't like that idea, so I thought about two related lists.  One was
keyed on the IP and the other keyed on the socket.  The IP key works
fine, but I didn't think you could use an object as a key in Python.  The
reason for the second list keyed on socket is to avoid having to run
through the IP-keyed array looking for a matching value when trying to
purge a dead socket (i.e. looping through the list of socket objects that
select() returned, if one is dead you can't do a getpeername() on it, so
go to the second list, search for the socket ID and get its IP).
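
For what it's worth, as far as I can tell Python socket objects are
hashable by identity (they don't define their own equality), so they can
serve as dictionary keys directly, even after close().  A small sketch,
using socketpair() as a stand-in for a real device connection and a
made-up documentation IP:

```python
import socket

# socketpair() stands in for an accepted device connection.
a, b = socket.socketpair()

device_ip = "192.0.2.10"          # hypothetical static device IP
socket_by_ip = {device_ip: a}     # keyed on IP
ip_by_socket = {a: device_ip}     # keyed on the socket object itself

a.close()                         # the connection dies

# The closed socket object still works as a key, so no getpeername() is needed.
ip = ip_by_socket.pop(a)
del socket_by_ip[ip]

b.close()
```

The key here is the object's identity, not the connection state, so it
survives the socket being closed.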

OK, so the next idea was to use the file descriptor of the socket to get
around the key problem.  I thought that was going to work until I ran
into the hiccup of the socket possibly disappearing (timed-out
connection, device powers off, etc.).  When the socket closes for whatever
reason, the file descriptor goes away, although the socket object itself
persists after an explicit close() until it's garbage collected.  If the
file descriptor vanishes, I can't use it to purge the lists.
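
A small sketch of the file descriptor behaviour, assuming Python 3 (where
fileno() on a closed socket returns -1) and using socketpair() in place
of a real device.  Variable names are made up for illustration:

```python
import socket

a, b = socket.socketpair()

fd_at_accept = a.fileno()          # a small non-negative integer

b.close()                          # the "device" end goes away
fd_after_peer_close = a.fileno()   # unchanged: our end is still open locally

a.close()                          # we close our end
fd_after_local_close = a.fileno()  # -1 on Python 3
```

So the descriptor only vanishes once the local side calls close().  If it
is used as a key, it would need to be recorded at accept() time, and the
OS is free to hand the same number to a later connection, so stale
entries have to be purged before the socket is closed.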

So I'm trying to figure out if there's a more persistent identifier
available that can be used as a key, will persist through a socket
termination, and can also be used to reference the socket it came from.

The gist of the daemon (this runs in a loop) in rough Python (some
pseudo-Python for brevity):

read,write,error = select(socketlist, [], [], 0) # no blocking

for s in read:
	if s == the_server_socket:
		client, address = s.accept()
		socketlist.append(client)
		#right here I would want to add:
		ip = address[0]  #address is an (ip, port) tuple
		client_track_list_by_ip[ip] = client.identifier
		client_track_list_by_identifier[client.identifier] = ip

	else:
		#this was a client device, do things

	#this bit of code runs in various places
	#the idea is to remove the socket from the
	#socketlist if it's dead

	(if the socket is dead or caused an error):
		socketlist.remove(s)
		#here I would attempt to purge the two tracking lists
		#getpeername() won't work here because the socket
		#isn't connected, go find it in the other list
		ip = client_track_list_by_identifier[s.identifier]
		del client_track_list_by_ip[ip]
		del client_track_list_by_identifier[s.identifier]
		s.close()

#then the queue handling:
while ( queue is not empty ):
	queue_item = getfromqueue()

	send_to_socket(client_track_list_by_ip[queue_item.ip], queue_item.data)
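
For comparison, here is a runnable sketch of the same loop.  It is only
one possible arrangement under stated assumptions: it swaps in the socket
object itself as the key instead of the identifier attribute used above,
it assumes one connection per device IP, and the names service_once,
drop_client, socket_by_ip, ip_by_socket and outgoing are all made up for
illustration.

```python
import queue
import select
import socket

def drop_client(s, socket_by_ip, ip_by_socket):
    """Purge a dead client from both maps, keyed on the socket object."""
    ip = ip_by_socket.pop(s, None)
    if ip is not None and socket_by_ip.get(ip) is s:
        del socket_by_ip[ip]
    s.close()

def service_once(server, socket_by_ip, ip_by_socket, outgoing, timeout=0.1):
    """One pass of the loop: accept, read, then flush the send queue."""
    received = []
    readable, _, _ = select.select([server] + list(ip_by_socket), [], [], timeout)
    for s in readable:
        if s is server:
            client, address = s.accept()
            ip = address[0]            # address is an (ip, port) tuple
            socket_by_ip[ip] = client
            ip_by_socket[client] = ip
            continue
        try:
            data = s.recv(4096)
        except OSError:
            data = b""                 # treat a failed read like a closed peer
        if data:
            received.append((ip_by_socket[s], data))
        else:
            drop_client(s, socket_by_ip, ip_by_socket)

    # Queue handling: runs every pass, whether or not any device sent data.
    while True:
        try:
            ip, data = outgoing.get_nowait()
        except queue.Empty:
            break
        s = socket_by_ip.get(ip)
        if s is None:
            continue                   # device not connected; this sketch drops the item
        try:
            s.sendall(data)
        except OSError:
            drop_client(s, socket_by_ip, ip_by_socket)
    return received

# Loopback demo with a single fake device.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(5)
socket_by_ip, ip_by_socket, outgoing = {}, {}, queue.Queue()

device = socket.create_connection(server.getsockname())
service_once(server, socket_by_ip, ip_by_socket, outgoing, timeout=1.0)  # accepts it

outgoing.put(("127.0.0.1", b"hello"))
service_once(server, socket_by_ip, ip_by_socket, outgoing, timeout=0)    # flushes queue
reply = device.recv(4096)

device.close()
service_once(server, socket_by_ip, ip_by_socket, outgoing, timeout=1.0)  # sees EOF, purges
server.close()
```

With a select() timeout of 0 the loop spins like the original.  A small
positive timeout bounds how long a queued item can wait without burning
CPU.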