Annoying Problems With std::vector on Linux Vs. Windows
So I spent most of this weekend getting my engine to compile correctly under GCC/Linux (Ubuntu 10.04) and ran into some serious headaches. It’s been a few years since I last built anything on Linux at all (the last time was when I did some work for Epic Interactive and their Linux ports), and it took me a little while to remember how makefiles work and to bring my SDL code up to date.
After compiling the code, I was having a hell of a time trying to figure out why it would always cause a segmentation fault after the first few seconds of running. Here is what gdb had to say:
[Thread debugging using libthread_db enabled]
[New Thread 0xb3da2b70 (LWP 4813)]
[Thread 0xb3da2b70 (LWP 4813) exited]
[New Thread 0xb3da2b70 (LWP 4815)]

Program received signal SIGSEGV, Segmentation fault.
0x0807c27d in ?? ()
(gdb) backtrace
#0 0x0807c27d in ?? ()
#1 0x08083eee in ?? ()
#2 0x0807e59f in ?? ()
#3 0x08082231 in ?? ()
#4 0x08062d98 in ?? ()
#5 0x0804deaa in ?? ()
#6 0x0806659b in ?? ()
#7 0x00369bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#8 0x08049891 in ?? ()
Unfortunately, that doesn’t really tell me much of anything 🙂 After a little bit of playing around with the manual pages in GDB, and some serious googling, I was able to get the app to compile with some debug symbols. This is the output:
[Thread debugging using libthread_db enabled]
[New Thread 0xb3da2b70 (LWP 3711)]
[Thread 0xb3da2b70 (LWP 3711) exited]
[New Thread 0xb3da2b70 (LWP 3713)]
/usr/include/c++/4.4/debug/vector:265:error: attempt to subscript container with out-of-bounds index 0, but container only holds 0 elements.

Objects involved in the operation:
sequence "this" @ 0x0xb162bf4 {
  type = NSt7__debug6vectorImSaImEEE;
}

Program received signal SIGABRT, Aborted.
0x0012d422 in __kernel_vsyscall ()
This error is generated whenever I try to set up vectors in the code to handle some rather simple arrays. I even tested code where I created a vector and checked whether its size was 0, and was still able to reproduce the problem. This is how I did all the safety checks in my code: if Vector.size() > 0 { Safe To Do Some Stuff, If Size Was Legal For Array }.
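For reference, the message in the debug output comes from libstdc++'s checked operator[], and the mangled type name decodes to a vector of unsigned long. In standard C++, subscripting past the end of a vector is undefined behavior on every platform, so one build can crash where another appears to work. The sketch below is not the engine's code; ValueOr and ThrowsWhenOutOfRange are hypothetical helper names made up for illustration. It shows two ways to guard an access: comparing the index itself against size(), and using at(), which throws std::out_of_range instead.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the engine): returns values[index] when the
// index is in range, otherwise returns the supplied fallback.
unsigned long ValueOr(const std::vector<unsigned long>& values,
                      std::size_t index,
                      unsigned long fallback) {
    // Compare the index itself against size(); a bare "size() > 0" test
    // only proves that index 0 is valid.
    return index < values.size() ? values[index] : fallback;
}

// Hypothetical helper: reports whether at() rejects the index.
// at() does its own bounds check and throws std::out_of_range.
bool ThrowsWhenOutOfRange(const std::vector<unsigned long>& values,
                          std::size_t index) {
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With an empty vector, ValueOr(v, 0, 42) hands back the fallback and ThrowsWhenOutOfRange(v, 0) reports true, so neither path ever reaches the unchecked subscript.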
So, for whatever reason, vectors behave very differently under GCC on Linux than they do on Windows. I really don’t have the time to figure out the specific reasons, though during a couple of hours of googling to get GDB to display as much information as it did, I found that a lot of people have issues with vectors doing strange things on Linux. As a result, I will be removing that code from the project to make sure it’ll compile and run on all three platforms (Windows, Mac and Linux) without any further issues! Lucky for me, it is an easy fix!
Just a quick update 🙂
I removed all of the vector code in the engine today and replaced the vectors with fixed-size arrays (example: int ThisArray[MAX_VALUE];) and voila! All issues vanished completely!
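A rough sketch of what a fixed-size replacement can look like is below. It is an illustration only, not the engine's actual code: MAX_VALUE borrows the name from the example above, but its value of 64 is an assumption, and FixedIntArray with its Push and Get members is a hypothetical wrapper. A plain array has no size() of its own, so the sketch keeps a separate count of the slots in use and checks every index against it.

```cpp
#include <cstddef>

// MAX_VALUE borrows the name used in the post; 64 is an assumed capacity.
const std::size_t MAX_VALUE = 64;

// Hypothetical wrapper: a fixed-capacity int array that tracks how many
// slots are actually in use.
struct FixedIntArray {
    int values[MAX_VALUE];
    std::size_t count;

    // Zero-initialise the storage and start with no slots in use.
    FixedIntArray() : values(), count(0) {}

    // Appends a value; returns false when the array is already full.
    bool Push(int value) {
        if (count >= MAX_VALUE) {
            return false;
        }
        values[count++] = value;
        return true;
    }

    // Copies element `index` into `out`; returns false for unused slots.
    bool Get(std::size_t index, int& out) const {
        if (index >= count) {
            return false;
        }
        out = values[index];
        return true;
    }
};
```

The trade-off is that the capacity is fixed at compile time, so Push has to report failure once MAX_VALUE entries are stored, whereas a vector would grow.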
It still runs as expected on Windows and performs exactly as expected on Linux, which is good to know as I was doubting my own code for a little bit there.
I’m still a bit baffled as to why it behaves so differently. I mean, if the vector is empty and is queried with size(), it shouldn’t shit it out like it did. All I can put it down to is some weird-ass override somewhere in the GCC setup on my box, despite it being vanilla.
Now, off to get this thing finished. I’m running out of time!