ubuntu 8.04 to ubuntu 9.04 (Hardy to Jaunty)

Post by **jkesterson** » Tue Apr 27, 2010 9:34 pm

I am using a SUB-20 board SPI interface as a demo controller for an evaluation board for our company product. I developed the application (a GUI) in C using GTK and libusb-1.0.6. The same GUI works under Linux and Windows. When I upgraded my Linux system from Hardy (8.04) to Jaunty (9.04), it stopped working reliably. The application comes up, but with repeated SPI transfers, I get a segmentation violation associated with the SPI transfer call.

I re-downloaded the libusb-1.0.6 and recompiled it and installed it on the upgraded machine. I believe I have isolated the problem to this library. My system can dual boot between the old and new version of Linux, and on the new version, whenever I execute a function that does multiple SPI transfers, after a short time, I get a segmentation violation and the program crashes.

Is there anyone out there who has successfully used the sub-20 on a Linux Jaunty 9.04 system? Also, is there a different version of the libusb-1.0.6 or libusb-1.0.7 that I should try? (I have tried both).

Thanks for any help that can be offered!
John

Post by **xol** » Wed Apr 28, 2010 9:43 am

Hi,
To try to help you we need additional information

1. So does it work on Jaunty with libusb-1.0.6 ?
2. Do you have a specific API call it fails on? Can you send us a piece of code around it?
3. You can turn on tracing with
export SUB_DEBUG=10
Please send us trace before segfault
4. And finally you can run it under gdb to trace back failure.

Post by **jkesterson** » Mon May 03, 2010 9:33 pm

1. So does it work on Jaunty with libusb-1.0.6 ?

It does NOT work with libusb1.0.6 or 1.0.5, or 1.0.7.

2. Do you have a specific API call it fails on? Can you send us a piece of code around it?

First of all, it happens when I am doing a tight loop of repetition. I have seen it occur on 3 API calls:
1:
    if( sub_spi_config( dev,
            SPI_ENABLE | SPI_CPOL_RISE | SPI_SMPL_SETUP |
            SPI_MSB_FIRST | spi_clk_fq, 0 ) != 0 )
2:
    ret = sub_gpio_write( dev, VSYNC, &tmp, VSYNC ) ;
    ret = sub_gpio_write( dev, 0, &tmp, VSYNC ) ;
3:
    int SUB20_SPI_TRANSFER( /* sub_device dev, */ char *bf_o, char *bf_i, int len, int ss )
    {
        int ret ;
        if( len < 50 )
        {
            ret = sub_spi_transfer( dev, bf_o, bf_i, len, SS_CONF(ss, SS_LO) ) ;
        }
        else
        {
            ret = sub_spi_transfer( dev, bf_o, bf_i, len - 40, SS_CONF(ss, SS_L) ) ;
            ret = sub_spi_transfer( dev, bf_o + 40, bf_i + 40, len - 40, SS_CONF(ss, SS_LO) ) ;
        }
        return ret ;
    }

In 3 above, it can get caught on any one of the three sub_spi_transfer calls.
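
A side note on snippet 3, unrelated to the crash: the long-transfer branch passes len - 40 as the length of both calls, so every byte is sent exactly once only when len is 80. That may be intentional for this device. If the intent was instead to split a long buffer into consecutive pieces, a minimal sketch could look like the following. The names chunked_transfer, xfer_fn and loopback_xfer are hypothetical; the callback stands in for a wrapper around sub_spi_transfer, and the `last` flag is an assumption about how the caller would pick the SS_CONF mode for the final piece.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback type: transfers `len` bytes and returns 0 on success.
 * `last` is nonzero for the final chunk, so the callback can choose a
 * different slave-select mode for it. */
typedef int (*xfer_fn)(char *out, char *in, int len, int last, void *ctx);

/* Splits one logical transfer of `len` bytes into consecutive pieces of at
 * most `chunk` bytes. Offsets advance by the number of bytes actually sent,
 * so no byte is skipped or sent twice. Returns 0, or the first nonzero
 * value returned by the callback. */
int chunked_transfer(char *out, char *in, int len, int chunk,
                     xfer_fn xfer, void *ctx)
{
    int done = 0;

    assert(chunk > 0 && xfer != NULL);

    while (done < len) {
        int n = len - done;
        int last, ret;

        if (n > chunk)
            n = chunk;
        last = (done + n == len);

        ret = xfer(out + done, in + done, n, last, ctx);
        if (ret != 0)
            return ret;
        done += n;
    }
    return 0;
}

/* Example callback for trying the splitter without hardware:
 * copies out to in and counts the calls through ctx. */
int loopback_xfer(char *out, char *in, int len, int last, void *ctx)
{
    (void)last;
    memcpy(in, out, (size_t)len);
    (*(int *)ctx)++;
    return 0;
}
```

With a 100-byte buffer and a chunk size of 40, the callback is invoked three times, for 40, 40 and 20 bytes.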

3. You can turn on tracing with
export SUB_DEBUG=10
Please send us trace before segfault

I am sorry, I do not know how to do the export SUB_DEBUG=10 call. However, I have done a strace and ltrace on when the failure occurs.

I have also been able to run it under gdb. It does not always fail on the same call.

The ltrace is as follows:

glade_xml_get_widget(0x88d5040, 0x8054166, 0, 0, 0xbfbc0c18) = 0x88d84c8
gtk_object_get_type(0x88d5040, 0x8054166, 0, 0, 0xbfbc0c18) = 0x88db388
g_type_check_instance_cast(0x88d84c8, 0x88db388, 0, 0, 0xbfbc0c18) = 0x88d84c8
gtk_object_set(0x88d84c8, 0x805411c, 0, 0, 0xbfbc0c18) = 1
libusb_alloc_transfer(0, 0x8054497, 11, 0xb6f6d8e4, 0x88d84c8) = 0x8a2099c
libusb_submit_transfer(0x8a2099c, 0x8054497, 11, 0xb6f6d8e4, 0x88d84c8) = 0
libusb_bulk_transfer(0x8a20800, 130, 0xbfbc0ac0, 64, 0xbfbc0a58) = 0
glade_xml_get_widget(0x88d5040, 0x80531a9, 0x804d314, 0, 0) = 0x88d6758
g_type_check_instance_cast(0x88d6758, 80, 0x804d314, 0, 0) = 0x88d6758
g_signal_connect_data(0x88d6758, 0x8052cdf, 0x804ce7c, 0, 0) = 363
glade_xml_get_widget(0x88d5040, 0x80531c1, 0x804ce7c, 0, 0) = 0x88d8458
g_type_check_instance_cast(0x88d8458, 80, 0x804ce7c, 0, 0) = 0x88d8458
g_signal_connect_data(0x88d8458, 0x80531d1, 0x804f2ea, 0, 0) = 364
glade_xml_get_widget(0x88d5040, 0x80531d9, 0x804f2ea, 0, 0) = 0x88d8618
g_type_check_instance_cast(0x88d8618, 80, 0x804f2ea, 0, 0) = 0x88d8618
g_signal_connect_data(0x88d8618, 0x80531d1, 0x804f35c, 0, 0) = 365
glade_xml_get_widget(0x88d5040, 0x80531e9, 0x804f35c, 0, 0) = 0x88d84c8
g_type_check_instance_cast(0x88d84c8, 80, 0x804f35c, 0, 0) = 0x88d84c8
g_signal_connect_data(0x88d84c8, 0x8052cdf, 0x804f4a8, 0, 0) = 366
gtk_widget_show_all(0x89000d8, 0x8052cdf, 0x804f4a8, 0, 0) = 1
gtk_main(0x89000d8, 0x8052cdf, 0x804f4a8, 0, 0 <unfinished ...>
glade_xml_get_widget(0x88d5040, 0x8053fc8, 0xb6d8d9c5, 0xb6e7a140, 64) = 0x8914508
gtk_object_get_type(0x88d5040, 0x8053fc8, 0xb6d8d9c5, 0xb6e7a140, 64) = 0x88db388
g_type_check_instance_cast(0x8914508, 0x88db388, 0xb6d8d9c5, 0xb6e7a140, 64) = 0x8914508
gtk_object_set(0x8914508, 0x8053ce8, 0x8054013, 0, 64) = 1
sigemptyset(0x080585c4) = 0
sigaction(14, 0x080585c0, NULL) = 0
setitimer(0, 0x80585ac, 0, 0, 64) = 0
gtk_events_pending(0, 0x80585ac, 0, 0, 64) = 1
gtk_main_iteration(0, 0x80585ac, 0, 0, 64 <unfinished ...>
--- SIGALRM (Alarm clock) ---
--- SIGALRM (Alarm clock) ---
--- SIGALRM (Alarm clock) ---
--- SIGALRM (Alarm clock) ---
--- SIGALRM (Alarm clock) ---
--- SIGALRM (Alarm clock) ---
<... gtk_main_iteration resumed> ) = 0
gtk_events_pending(0, 0x80585ac, 0, 0, 64) = 0
libusb_alloc_transfer(0, 0x8054497, 11, 0xb6c1c738, 0x88c4d08) = 0x8a43d84
libusb_submit_transfer(0x8a43d84, 0x8054497, 11, 0xb6c1c738, 0x88c4d08 <unfinished ...>
--- SIGALRM (Alarm clock) ---
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

Here is the strace:

ioctl(6, USBDEVFS_SUBMITURB, 0xa173768) = 0
clock_gettime(CLOCK_MONOTONIC, {23939, 906460141}) = 0
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLOUT}], 2, 2000) = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
ioctl(6, USBDEVFS_DISCARDURB, 0xa173768) = 0
clock_gettime(CLOCK_MONOTONIC, {23939, 907254449}) = 0
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLOUT}], 2, 2000) = 1 ([{fd=6, revents=POLLOUT}])
ioctl(6, USBDEVFS_REAPURBNDELAY, 0xbf8a1c88) = 0
clock_gettime(CLOCK_MONOTONIC, {23939, 907579282}) = 0
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLOUT}], 2, 2000) = 1 ([{fd=6, revents=POLLOUT}])
ioctl(6, USBDEVFS_REAPURBNDELAY, 0xbf8a1c88) = 0
clock_gettime(CLOCK_MONOTONIC, {23939, 907898527}) = 0
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLOUT}], 2, 2000) = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
clock_gettime(CLOCK_MONOTONIC, {23939, 911313219}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

By the way, the ltrace is quite long, so I have only included the last 50 lines or so.

Using GDB, I have run it and looked at the stack frame from where my code calls are made, and there do not appear to be any undefined or out of range settings.

Unfortunately, the code is quite large and complicated. However, it is not secret, so if you would like, I can zip it up and send it to you. It is a pretty large GUI that we use to demo an IC to customers. It works flawlessly with Windows and Ubuntu 8.04.

I hope this information is helpful. I look forward to hearing from you.

Thanks and very best regards,
John

Post by **jkesterson** » Mon May 03, 2010 9:49 pm

PS.
I just realized what you were saying about the export statement.
I will try that as soon as I get back to the board.
Thanks a lot,
John

Post by **jkesterson** » Mon May 03, 2010 9:57 pm

I have done the export SUB_DEBUG=10, and I get the following:
BULK_OUT 11 bytes:
0a 62 08 00 20 00 00 00 20 00 00
Transfer completed, transferred=11
BULK_IN 7 bytes:
06 62 04 00 20 00 00
BULK_OUT 11 bytes:
0a 62 08 00 20 00 00 00 00 00 00
Transfer completed, transferred=11
BULK_IN 7 bytes:
06 62 04 00 00 00 00
BULK_OUT 4 bytes:
03 40 01 c0
Transfer completed, transferred=4
BULK_OUT 50 bytes:
31 43 2f 15 2d 00 40 00 02 00 80 20 08 02 00 80
20 08 02 00 80 20 08 02 00 80 20 08 02 00 80 20
08 02 00 80 20 08 02 00 80 20 08 02 00 80 20 08
02 00
Segmentation fault

It looks a little different each time as it can fail on any of my BULK_OUT or IN functions; but they all seem to have completed. I never see one where the data in the transfer doesn't make sense.

Thanks again!
John

Post by **xol** » Tue May 04, 2010 8:55 am

Hi,
I think the best we can do is to try to reduce your code to something minimal that can still generate a segfault. Please try to extract only the problematic portion.
In this case we could try to run it on our side and check.
Another option is to run our sub_app in a loop. It has a repeat option, -r. Please take a look at it and try to run it. Tell us if it fails, and if so we will do the same on our side. The problem can be related to libusb, the kernel, or even a specific PC USB host controller implementation. So it is good to get the problem localized in your environment and after that check it more deeply on other systems.

Post by **jkesterson** » Wed May 05, 2010 3:22 pm

Hello and thank you for your comments,
I believe you are right on track. I spent several hours debugging with gdb and it appears that there is some interaction between ITIMER_REAL and the libusb functions. I can run the operation with ITIMER_VIRTUAL just fine with only a slight cost in timing accuracy, but when ITIMER_REAL is used, the segmentation violation occurs.

I had originally thought it might be an interaction between gtk, glade, and libusb; but now that I understand that it is only between the timer and libusb, I will see if I can put together a simple application that duplicates the problem.

I had also thought that it might be kernel related; however I tested the code out on several different machines running Jaunty (9.04) and the failure was consistent with all of them. It could in fact still be something related to the kernel; but if I compile a special kernel that works, it would only mask the problem since I don't want every machine that runs the app to have to have a special kernel. So for now I am going with the virtual timer.
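
The difference between the two timers can be seen without libusb or the SUB-20 at all. The sketch below is an illustration, not a diagnosis of the crash: it arms a one-shot timer and then blocks in poll(), much like the poll() calls in the strace earlier in the thread. ITIMER_REAL counts wall-clock time, so SIGALRM arrives while the process is blocked and poll() fails with EINTR. ITIMER_VIRTUAL counts only the CPU time the process spends in user mode, so it does not expire while the process sleeps. The function name poll_interrupted_by and the 20 ms and 500 ms values are made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t timer_fired = 0;

/* The handler only sets a flag. A handler must be installed, because an
 * ignored signal would not interrupt poll(). */
static void on_timer(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/* Arms a one-shot 20 ms timer of kind `which` (ITIMER_REAL or
 * ITIMER_VIRTUAL), whose signal is `signo`, then blocks in poll() for up
 * to 500 ms. Returns 1 if poll() was interrupted by the signal (EINTR),
 * 0 if poll() timed out normally, -1 on a setup error. */
int poll_interrupted_by(int which, int signo)
{
    struct sigaction sa, old_sa;
    struct itimerval tv, off;
    int rc, saved_errno;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(signo, &sa, &old_sa) != 0)
        return -1;

    memset(&off, 0, sizeof off);
    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_usec = 20000;          /* one shot, 20 ms */
    if (setitimer(which, &tv, NULL) != 0) {
        sigaction(signo, &old_sa, NULL);
        return -1;
    }

    rc = poll(NULL, 0, 500);              /* no descriptors: a pure sleep */
    saved_errno = errno;

    setitimer(which, &off, NULL);         /* disarm before restoring */
    sigaction(signo, &old_sa, NULL);

    return (rc == -1 && saved_errno == EINTR) ? 1 : 0;
}
```

Calling poll_interrupted_by(ITIMER_REAL, SIGALRM) should report an interrupted poll(), while poll_interrupted_by(ITIMER_VIRTUAL, SIGVTALRM) should report a normal timeout. This shows only that the real-time timer cuts into blocking system calls and the virtual one does not; it does not show which component mishandles the interruption.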

In the meantime, I will put together a small app that duplicates the problem without all the baggage of my app. When I have it set up and able to duplicate the failure, I will send it to you.

Also, running in repeat mode does not cause the failure unless it is being operated with the itimer using the setitimer function.

Thanks again for your looking at this.

Very best regards,
John

Post by **jkesterson** » Wed May 05, 2010 8:22 pm

Hello again,

I have placed a short program below that will duplicate the problem. I have placed a define statement that can be used to easily switch the program between use of the real time and virtual timers. When using the real time timer, the segmentation violation occurs. When using the virtual one, it works fine.

You can observe the timing by monitoring the GPIO 13 line.

There are two files. They are SPI_IO.h (very short) and rt_op.c (about 100 lines).

In normal operation, this will write successively to the GPIO and SPI bus for 20,000 cycles.
Each cycle is 5000 usec, so it goes for about 100 seconds. This is usually long enough to catch a segmentation violation.
Sometimes it occurs right away, and other times it takes 10 to 20 seconds.

Again, operation is reliable with the virtual timer (comment out the "define RT" line).

It seems from this that something in libusb (or perhaps libsub?) is incompatible with something to do with ITIMER_REAL.

Anyway, I hope this test case helps. As I mentioned before, I have a solution (use the virtual timer), but I would still be interested to know why this happens.

By the way, I have also verified that the real time timer works fine when not making the sub calls.
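
For a fixed 5000 usec cycle, another option that avoids signals altogether is to sleep until an absolute deadline on CLOCK_MONOTONIC. The following is a minimal sketch under that assumption and has not been tried with the SUB-20. The names run_periodic, timespec_add_usec and count_call are hypothetical, and the `work` callback stands in for the per-cycle GPIO and SPI calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Adds `usec` microseconds to *t and keeps tv_nsec below one second. */
static void timespec_add_usec(struct timespec *t, long usec)
{
    t->tv_sec += usec / 1000000L;
    t->tv_nsec += (usec % 1000000L) * 1000L;
    if (t->tv_nsec >= 1000000000L) {
        t->tv_sec += 1;
        t->tv_nsec -= 1000000000L;
    }
}

/* Calls work(arg) once per period, `cycles` times, without using signals.
 * Deadlines are absolute, so time spent inside work() does not accumulate
 * as drift. Returns 0 on success, -1 if the clock cannot be read, or the
 * error number reported by clock_nanosleep(). */
int run_periodic(long period_usec, int cycles,
                 void (*work)(void *), void *arg)
{
    struct timespec next;
    int i, rc;

    assert(period_usec > 0 && work != NULL);

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (i = 0; i < cycles; i++) {
        timespec_add_usec(&next, period_usec);
        do {
            /* clock_nanosleep returns the error number directly. */
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return rc;
        work(arg);
    }
    return 0;
}

/* Example callback: counts how many times it was called. */
static void count_call(void *arg)
{
    (*(int *)arg)++;
}
```

In the test program this would correspond to run_periodic(5000, 20000, ...) with a callback that pulses VSYNC and performs the SPI transfer. In a GTK application the main loop would also need to keep running, so this fits a worker thread or a short burst better than the GUI thread.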

Thanks again and best regards,

John.
P.S. The code follows:

// BEGIN Code ---- SPI_IO.h
#include <libsub.h>

#define SCLK_EN 0x80 // PA7
#define MOSI_EN 0x02 // PA1
#define MISO_EN 0x08 // PA3
#define CSB_EN 0x01 // PA0
#define CSB_EN_ALT 0x04 // PA2

#define VSYNC 0x2000 // GPIO 13
#define BOOST_EN 0x1000 // GPIO 12

#define SPI_CLK_FQ SPI_CLK_8MHZ

// END of SPI_IO.h

// BEGIN Code - rt_op.c
#include <stdlib.h>
#include <stdio.h>
#include <libsub.h>
#include <time.h>
#include <sys/time.h>
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#include "SPI_IO.h"
#include <string.h>

#define RT // If this is defined, you can duplicate the segmentation violation...
#ifdef RT
#define ITIMER_NUM ITIMER_REAL
#define SIG_NUM SIGALRM
#else
#define ITIMER_NUM ITIMER_VIRTUAL
#define SIG_NUM SIGVTALRM
#endif

sub_device dev ;
int soft_vsync = 0 ;
int soft_vsync_on = 1 ;
char bf_o[1024], bf_i[1024] ;

struct itimerval value ;
int which = ITIMER_NUM ;
struct sigaction sact ;

int open_SPI()
{
    dev = sub_open( 0 ) ;
    if( !dev )
        return 0 ;
    return 1 ;
}

int pulse_vsync( /* sub_device dev */ )
{
    int ret ;
    int tmp ;
    ret = sub_gpio_write( dev, VSYNC, &tmp, VSYNC ) ;
    ret = sub_gpio_write( dev, 0, &tmp, VSYNC ) ;
    return( ret ) ;
}

void gen_soft_vsync()
{
    soft_vsync = 1 ;
    if( soft_vsync_on == 0 ) // Then disable the timer.
    {
        value.it_interval.tv_sec = 0 ;
        value.it_interval.tv_usec = 0 ;
        value.it_value.tv_sec = 0 ;
        value.it_value.tv_usec = 0 ;
        setitimer( which, &value, NULL ) ;
    }
}

void wait_soft_vsync()
{
    // Wait for soft vsync true
    //while( soft_vsync == 0 ) ; // Wait for timer to set it
    soft_vsync = 0 ; // Reset it and then return
}

int main()
{
    int len, i ;
    int count ;
    int ret ;
    // Before the main while loop, if Vsync is internally generated
    // then start the timer
    sigemptyset( &sact.sa_mask ) ;
    sact.sa_flags = 0 ;
    sact.sa_handler = gen_soft_vsync ;
    sigaction( SIG_NUM, &sact, NULL );
    value.it_interval.tv_sec = 0 ;
    value.it_interval.tv_usec = 5000 ;
    value.it_value.tv_sec = 0 ;
    value.it_value.tv_usec = 5000 ;
    setitimer( which, &value, NULL ) ;

    if( (ret = open_SPI()) != 1 )
        printf( "SPI not connected ret = %d\n", ret ) ;
    if( sub_spi_config( dev,
            SPI_ENABLE | SPI_CPOL_RISE | SPI_SMPL_SETUP |
            SPI_MSB_FIRST | SPI_CLK_FQ | SPI_CLK_8MHZ, 0 ) != 0 )
        printf( "sub_spi_config: %s", sub_strerror(sub_errno) ) ;
    len = 40 ;
    for( i = 0 ; i < len ; i++ )
    {
        bf_o[i] = i ;
        bf_i[0] = 0 ;
    }
    for( count = 0 ; count < 20000 ; count++ )
    {
        // Do real time stuff
        wait_soft_vsync() ; // Wait
        pulse_vsync() ;
        // Write it to the SPI.
        sub_spi_transfer( dev, bf_o, bf_i, len, SS_CONF(1, SS_LO) ) ;
    }
    soft_vsync_on = 0 ;
    return(0);
}
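
If ITIMER_REAL has to stay, one thing that could be tried is keeping SIGALRM blocked while a USB call is in progress, so the handler runs only between calls. The following is a minimal sketch of that idea using sigprocmask(); it has not been verified against libsub or libusb, and the names call_with_sigalrm_blocked, start_tick_timer, stop_tick_timer, ticks_seen and fake_usb_call are all hypothetical. fake_usb_call merely sleeps in place of a real transfer. In a multi-threaded program pthread_sigmask() would be the call to use instead.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t ticks = 0;   /* incremented by the handler */

static void on_tick(int signo)
{
    (void)signo;
    ticks++;
}

/* Installs the handler and starts a periodic ITIMER_REAL. Returns 0 on success. */
int start_tick_timer(long period_usec)
{
    struct sigaction sa;
    struct itimerval tv;

    assert(period_usec > 0 && period_usec < 1000000L);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&tv, 0, sizeof tv);
    tv.it_interval.tv_usec = period_usec;
    tv.it_value.tv_usec = period_usec;
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Disarms the timer; the handler stays installed. */
int stop_tick_timer(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}

int ticks_seen(void)
{
    return (int)ticks;
}

/* Runs fn(arg) with SIGALRM blocked, then restores the previous mask.
 * A SIGALRM that became pending meanwhile is delivered when the mask
 * is restored. Returns 0 on success, -1 on error. */
int call_with_sigalrm_blocked(void (*fn)(void *), void *arg)
{
    sigset_t block, saved;

    assert(fn != NULL);

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;
    fn(arg);
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Stand-in for a USB transfer: sleeps 30 ms and stores in *arg how many
 * ticks the handler counted during the sleep. */
void fake_usb_call(void *arg)
{
    struct timespec ts = { 0, 30 * 1000 * 1000 };
    int before = ticks_seen();

    nanosleep(&ts, NULL);
    *(int *)arg = ticks_seen() - before;
}
```

Pending SIGALRM instances are not queued, so several timer expirations during one blocked call collapse into a single handler invocation. For the test program above that matters little, because the handler only sets a flag.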

Post by **xol** » Fri May 07, 2010 8:29 am

Hi,
I'm working with the libusb development team on this issue.
They asked us to check it with the new Ubuntu 10.04 and libusb 1.0.8. Can you do that, please?
(You can also try libusb 1.0.8 on Ubuntu 9.04.)
