Auria {l Wrote}:It's a bit hard to be fully sure since I do not have this crash myself, but I believe I have fixed the crash you encountered. If you know how to use the latest git version, you could try it; otherwise, the fix will come with next release
Thanks for the quick response! I finally also got some time to give it a try and i got myself into a bughunt
Long story short, the fix is not enough but close, and the problem is actually pretty elusive.
So the (very) long story:
Let me tell you, this bug sucks, bigtime. I think (in fact guess) the root cause is a bug in irrlicht, with multiple coincidences in STK leading to the crash.
I cloned the latest git version, built it, ran it and there was no crash, controller worked like charm. I could have ended the check here, but I'm used to reproduce an error, fix it and then verify the fix is working (sorry, bad habit
), therefore I fetched the 0.9.3 sources, built it, ran it and there was no crash, controller worked like charm :-o
I compiled all 4 build configurations, all worked. My installed STK still crashed. I installed on another machine, crash. This time I noticed that I installed x64, while I compiled x86. OK, I got it.
So I compiled 0.9.3 x64, ran it, and there was no crash, controller worked like charm...
I went on, compiled all build configurations, and "finally" Release and MinSizeRel had the crash. Of course, exactly the ones without debug info.
My first question here: The installed STK crash dialog had debug info, even line numbers. I could only reproduce the problem with Release builds without debug info. Which build configuration is packaged in the installer?
So I went back to the latest sources and compiled x64 Release to see if the fix fixed, but it did not. I had no debug info and the exception message in the debugger said:
Exception thrown at 0x00007FF65A274315 in supertuxkart.exe: 0xC0000005: Access violation reading location 0x0000000000000000
Looks like a null pointer dereference, not much info.
I started logging around and found that the code now passed GamePadDevice::isButtonPressed (in the new else branch), but the caller method InputManager::input in line 959 now called GamePadDevice::setButtonPressed which crashed the same way. I added the same boundary check to setButtonPressed and then there was no more crash.
So here's what I found out:
I do not know the code and I might well miss the context or historical view, but I think that this boundary check is a symptomatic treatment that hides an inconsistency in the handled controller data.
I found that the controller in question is reported twice by Irrlicht, I do not know whether this is a bug or not. The first device is reported with 16 buttons, the second one with 0 buttons. The difference is that in the good case Irrlicht reports them with HasGenericName == true, whereas in the bad case (Release) they have HasGenericName == false, and I guess that this is an Irrlicht bug, because app build mode should not affect the gamepad driver.
So the conditions for the crash are:
- two gamepads with the same name (e.g. "Generic Joystick" or whatever)
- HasGenericName is false for them
- the second one has less buttons (let's call them bc1 and bc2 below, so bc1 > bc2)
I do not know how likely this is, especially regarding the second one.
There are 3 places in STK code that end up in the crash:
1.
DeviceManager::initialize enumerates the gamepad devices and only if HasGenericName is true, it appends the ID to the name. In the bad case this flag is false, so the local "name" variable will have the same value for both gamepads. Consequently getConfigForGamepad() will pick the _same instance_ of GamepadConfig for them. The loaded configuration will have the default 0 button count in it.
2.
This same GamepadConfig instance is passed to both constructors when the two GamePadDevice objects are created. The constructors also receive the actual number of buttons, as reported by Irrlicht.
Right at the beginning the actual number of buttons (button_count) is set into the configuration, but only if the button count in the configuration is less than the actual button count.
When the first GamePadDevice object is created button_count is bc1 and the button count in config is 0. 0 is less than bc1, therefore the button count in config is set to bc1.
When the second GamePadDevice object is created button_count is bc2 and the button count in the config is bc1 (remember, same config instance as in the previous constructor call), as bc1 > bc2 in our preconditions, the configuration's button count is not updated and remains bc1. Here we have the inconsistency in the button count housekeeping.
3.
In the constructor the m_button_pressed array holding the button states is sized to button_count (bc2).
GamePadDevice::getNumberOfButtons() returns the button count of the configuration (bc1) which is now greater than the size of m_button_pressed.
Therefore the user of the GamePadDevice object (InputManager) will try to access the buttons up to bc1 (with GamePadDevice::isButtonPressed), but the button state array ends at bc2 and above that we address out of the array.
So current fix with the boundary checks will solve this by faking unpressable buttons above bc2, but I think it'd be worth to rather eliminate the root cause(s) as it would also make the code simpler.
Here are my suggestions based on quite local knowledge of these few source files, so handle with care
1. Currently in DeviceManager::initialize the ID of the gamepad is only attached to its name if HasGenericName is true. The configs are mapped to devices based on this device name.
I'd suggest to separate the display name from the name that is used to identify the device. The latter could always use the ID in the name so that it is unique.
2. m_axis_count and m_button_count and related methods in GamepadConfig are not used by the class. Other classes, i.e. GamePadDevice and WiimoteManager use these to store button and axis counts but they have their own ways as well.
GamePadDevice has the above mentioned redunant handling and therefore can use the received button count internally.
WiimoteManager sets the number of buttons and axes in the config but Wiimote does not use these values at all, instead it sets the same values to its GamePadDevice.
Here I miss why config->setNumberOfButtons is called only when the config button count is smaller than the actual. Were there some situations that requered this logic?
3. Keeping the boundary checks in GamePadDevice::[is|set]ButtonPressed is still a good idea, as these are public members and even if the number of buttons are reported properly a caller can pass any index.
Please find the sources I changed attached.
Thanks for the support!
Regards,
Gabor