Re: Moab Channels




Hi,
 
 I am sorry I was not able to get back to you earlier. I applied the patch
you gave, and this is what I noticed: the code enters 'chan_socketdata_intr'
but does not block on recvfrom, even after I increased the timeout. Instead,
the rest of the channel initialization proceeds correctly and exits. The
socktest example in the 'envs' directory works fine, though.
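
 One way to tell a socket-layer problem from an initialization problem is a
standalone check of how long recvfrom actually waits once SO_RCVTIMEO is set.
The sketch below is only an illustration under assumptions: the helper name
recv_wait_ms and its return codes are made up here and are not part of Moab
or the OSKit, and it assumes a POSIX-style environment where fcntl(F_GETFL)
and clock_gettime are available.

```c
#define _POSIX_C_SOURCE 200809L

#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Moab): bind a UDP socket to loopback,
 * set SO_RCVTIMEO to timeout_ms, and call recvfrom() with no sender.
 *
 * Returns the milliseconds spent inside recvfrom(), or
 *   -1 if socket setup failed,
 *   -2 if the socket is in non-blocking mode (recvfrom would never wait),
 *   -3 if recvfrom() did not end with a timeout (EAGAIN/EWOULDBLOCK).
 */
long recv_wait_ms(int timeout_ms)
{
	struct sockaddr_in addr;
	struct timeval timeo;
	struct timespec t0, t1;
	char buf[64];
	int s, flags, saved_errno;
	ssize_t n;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		return -1;

	memset(&addr, 0, sizeof addr);
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the system pick a free port */
	if (bind(s, (struct sockaddr *)&addr, sizeof addr) != 0) {
		close(s);
		return -1;
	}

	/* A non-blocking socket returns at once regardless of the timeout. */
	flags = fcntl(s, F_GETFL, 0);
	if (flags < 0) {
		close(s);
		return -1;
	}
	if (flags & O_NONBLOCK) {
		close(s);
		return -2;
	}

	timeo.tv_sec = timeout_ms / 1000;
	timeo.tv_usec = (timeout_ms % 1000) * 1000;
	if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &timeo, sizeof timeo) != 0) {
		close(s);
		return -1;
	}

	/* Nobody sends to this socket, so recvfrom() should wait and time out. */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	n = recvfrom(s, buf, sizeof buf, 0, NULL, NULL);
	saved_errno = errno;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(s);

	if (n >= 0 || (saved_errno != EAGAIN && saved_errno != EWOULDBLOCK))
		return -3;

	return (long)(t1.tv_sec - t0.tv_sec) * 1000 +
	       (t1.tv_nsec - t0.tv_nsec) / 1000000;
}
```

 Calling recv_wait_ms(500) from a small main and printing the result should
give a value close to 500 if the timeout is honored. A value near zero, or a
return of -2, would point at the socket being non-blocking or the timeout
being ignored rather than at the channel code itself.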

 In the meantime I also tested your Ethernet channels. The chantest
example works fine in the C/Moab/OSKit scenario. An analogous example with
OCaml/Moab/OSKit fails because 'chan_edev_intr' does not block. These
observations suggest that I may have screwed up something in the
initialization of OCaml.

Thanks for all your help
Ravi
 


On Tue, 8 May 2001, Mike Hibler wrote:

> Here is a patch that should take care of livelock problems in the
> chan_socket code.  It also fixes a stupid bug where we forgot to
> initialize the socket timeout for OSKit, which could also cause problems.
> 
> Now we'll see if that is really what you are experiencing...
> 
> Index: chan_socket.c
> ===================================================================
> RCS file: /n/fast/usr/lsrc/flux/CVS/moab/src/chan_socket.c,v
> retrieving revision 1.53
> diff -u -r1.53 chan_socket.c
> --- chan_socket.c	2001/01/06 06:28:27	1.53
> +++ chan_socket.c	2001/05/08 22:12:27
> @@ -487,6 +487,8 @@
>  	{
>  		int ierr;
>  
> +		timeo.tv_sec = 0;
> +		timeo.tv_usec = 500000;	/* 500ms */
>  		ierr = setsockopt(s, SOL_SOCKET, SO_RCVTIMEO,
>  				  &timeo, sizeof timeo);
>  		assert(ierr == 0);
> @@ -506,6 +508,18 @@
>  	bdata = ani_pbuf_data(rbuf);
>  	bsize = ani_pbuf_size(rbuf);
>  
> +	/*
> +	 * XXX run the loop at lowest priority.  This is counter to what a
> +	 * traditional interrupt handler does but its just too easy to
> +	 * livelock here otherwise.  This choice can lead to starvation
> +	 * in a system with other busy flows.  This is counter to our
> +	 * resource isolation tenet and needs to be addressed.  The quick
> +	 * and easy way would be to make sure that flows cannot share
> +	 * sockets and to make the service thread be a flow thread.
> +	 */
> +	err = ani_thread_setprio(ani_thread_current(), 0);
> +	assert(!ani_error_occured(err));
> +
>  	while (1) {
>  		chan = 0;
>  		
> @@ -607,14 +621,6 @@
>  		 */
>  		bdata = ani_pbuf_data(rbuf);
>  		bsize = ani_pbuf_size(rbuf);
> -
> -		/*
> -		 * Yield here so that the packet threads have a chance to
> -		 * do their thing and return the buffer to the inchan.
> -		 * Otherwise we might just use up all the available buffers
> -		 * and have to drop the rest of the packets.
> -		 */
> -		ani_thread_yield();
>  	}
>  
>  	/*
> 






Flux Research Group / Department of Computer Science / University of Utah