I have been wrestling, on and off, with a very mysterious crashing bug in an Android app of mine for quite a while now, and I’m hoping there may be light at the end of the tunnel.
What happens is, seemingly at random and extremely intermittently, the application will just stop and display a dialog saying “Unfortunately your bastard application has stopped” – or words to that effect.
I thought, “I’ll catch this, no worries” and added an UncaughtExceptionHandler ages ago.
But the whole thing just bombs out, and the handler fails to catch whatever exception might be causing this particular crash (it has worked for other exceptions).
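For reference, a minimal sketch of the kind of handler described above, in plain Java. The class name `CrashRecorder` and its methods are made up for illustration and are not the app's actual code. A handler like this only ever sees Java exceptions, so a crash in native code, or the process simply being killed, would bypass it, which is one possible explanation for the silence.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: remembers the last uncaught exception as text.
final class CrashRecorder implements Thread.UncaughtExceptionHandler {
    private final AtomicReference<String> lastReport = new AtomicReference<>();
    private final Thread.UncaughtExceptionHandler previous;

    CrashRecorder(Thread.UncaughtExceptionHandler previous) {
        this.previous = previous;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        StringWriter trace = new StringWriter();
        PrintWriter out = new PrintWriter(trace);
        error.printStackTrace(out);
        out.flush();
        lastReport.set("Thread " + thread.getName() + " died: " + trace);
        // Chain to the handler that was installed before this one, so the
        // platform's own crash handling still gets a chance to run.
        if (previous != null) {
            previous.uncaughtException(thread, error);
        }
    }

    String lastReport() {
        return lastReport.get();
    }

    // Installs a recorder as the process-wide default and returns it.
    static CrashRecorder install() {
        CrashRecorder recorder =
                new CrashRecorder(Thread.getDefaultUncaughtExceptionHandler());
        Thread.setDefaultUncaughtExceptionHandler(recorder);
        return recorder;
    }
}
```

Calling `CrashRecorder.install()` once at startup would be enough. In a real app the report would be written somewhere that survives the crash rather than held in memory.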
My next attempt was to put a ton of manual logging in – before, during and after every suspected event – but this became so unwieldy as to adversely affect the app’s performance, so I had to undo it all.
Now, onto my last ditch effort: I am temporarily diverting all logcat entries to a file on the SD Card so that I can read exactly what is going on just before the crash. With any luck it will die and surrender a full-on stack trace. Failing that, I will still hopefully see some tell-tale signs of what the actual problem is.
Wish me luck.
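One way the logcat diversion could be wired up from inside the app is to launch `logcat` as a child process with its `-f` (write to file) option. This is only a sketch under assumptions: the class name `LogcatCapture` and the output path are invented, the code only does anything useful on an Android device, and on more recent Android versions an app can generally read only its own log lines.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for dumping logcat output to a file.
final class LogcatCapture {
    // "-v threadtime" adds timestamps plus process and thread ids to each line;
    // "-f <file>" tells logcat to write to a file instead of stdout.
    static List<String> command(File output) {
        return Arrays.asList(
                "logcat", "-v", "threadtime", "-f", output.getAbsolutePath());
    }

    // Starts logcat in the background. The returned Process can be
    // destroyed later to stop the capture.
    static Process start(File output) throws IOException {
        return new ProcessBuilder(command(output))
                .redirectErrorStream(true)
                .start();
    }
}
```

Something like `LogcatCapture.start(new File(dir, "crash-hunt.log"))` early in startup would then leave a file to pick over after the next crash, where `dir` is whatever writable directory the app has access to.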
I’m just in the middle of a two-day bug hunt, and I’m poised to “make the kill” tonight when I get a chance to fully run the app, so I thought I would share my thoughts on two different styles of catching bugs: Stalking, and Laying Traps.
A lot of developers I’ve worked with prefer to catch a bug red-handed by stepping through the code with a debugging tool. I compare this approach to consciously stalking your “prey”, specifically chasing it down and remaining fully focussed throughout the process.
When you find the bug, you have to be careful to capture the information you require before it (possibly) gets lost, and you must be careful not to step past it or you might end up back at square one. This can be a frustrating experience, especially if the bug is intermittent or generally hard to reproduce.
My preferred method is to lay some traps. I add logging (or, more usually, highly temporary System.out.println calls) such that there is no way the bug can execute without revealing the exact pathway it took through the code. These are like traps that you can return to at your leisure and check if you’ve caught anything.
I prefer this for a couple of reasons: 1) I don’t have to be “on the ball” when the code is run, I can just pick up the output whenever I’m ready and – if I’ve done it right – it will definitely reveal how (or at least where) things are going wrong, and 2) This is the only way to reliably debug a multi-threaded application.
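A trap in this sense can be as small as the sketch below. `Trap` is a hypothetical helper, not something from the app. It stamps each message with the time and the current thread name, which is the bit that makes the output usable when several threads are involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical "trap": a thread-safe breadcrumb trail that also prints.
final class Trap {
    private static final List<String> CAUGHT =
            Collections.synchronizedList(new ArrayList<String>());

    // Records where execution got to, on which thread, and when.
    static void mark(String where) {
        String line = System.currentTimeMillis()
                + " [" + Thread.currentThread().getName() + "] " + where;
        CAUGHT.add(line);
        System.out.println(line);
    }

    // Returns a copy of everything caught so far, oldest first.
    static List<String> caught() {
        synchronized (CAUGHT) {
            return new ArrayList<>(CAUGHT);
        }
    }

    static void reset() {
        CAUGHT.clear();
    }
}
```

Sprinkling calls such as `Trap.mark("download started")` and `Trap.mark("download finished")` around the suspect code gives a trail that can be read back whenever it suits.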
So, I’ve got my traps set for tonight. This is probably my favourite part of catching bugs – with everything set up and ready to snare the little bastard that’s been annoying me for a while. The actual fix is never as much fun.
I’ve been doing some optimisation and “streamlining” of the Gee-Oh! Android library code, including doing away with an over-engineered “in memory” store of grid squares, and it has come with its own (not entirely unexpected) problems.
A good thing is that I’ve realised the kickstartFailedDownloads() method has been masking a multitude of sins.
Anyway, I now find myself in the process of re-wiring the core grid code quite heavily. I’m currently trying to figure out why – after animating to location – we are left with a partially populated grid where some grid squares never seem to get past the “initialised” state.
So this means a ton of System.out.println nonsense (who uses proper logging??) that I will remove once the investigation is complete. Got to get to the bottom of this!
After I’ve definitely fixed all the issues that have been previously masked, I will plug kickstartFailedDownloads() back in and try to tie up any further loose ends with the grid setting code, post-“in memory grid removal”. I’m sure it will all go smoothly…
Oh my God, finally. I’ve been chasing after a bug – on and off – for months now, and at last I cracked it today.
In the new MakingTracksGPS, there was this minor but very irritating bug where the location marker would periodically jump back a little bit during a pinch/zoom event. I tried so many fixes for this, absolutely none of which made a blind bit of difference. Until today.
I noticed that the rendering jump occurs quite reliably if you zoom in/out, then just keep your fingers still on the screen. So it wasn’t due to the actual zoom event, but rather something that wasn’t being reset until you let go.
Upon examining the ACTION_UP code block, I noticed a variable tempDisableInterpolation being set to false, so I followed its behaviour and – lo and behold – when this remains true, it results in calls to centre the display on what is effectively an incorrect location.
I did a bit of refactoring, altered (i.e. improved) the general behaviour of interpolated zoom, and now there is zero trace of this tacky, jumpy display error.
Now I’m down to four things on my “Barriers To Release” list:
- UniqueID DB table(s) and handling/monitoring the recording of usage stats
- Placenames rendering
- Pre-population of map tiles on GeeOhTileServer
  - Need to be conscious of storage limitations
- ProGuard builds
And that will be it, I’ll be shoving the v2.0 Beta out the door ASAP.
I’m still stuck on this smoothness/interpolation thing. Just can’t seem to get it right.
I’ve tried using a stack of points, and that’s ended in failure, so now I’m trying to use the last three points (which are explicitly defined in the code already) and hope that there is enough leeway in the incoming data to allow fully smooth interpolation.
It’s not going well.
I keep feeling like I’m so close to a solution, but it keeps ending in disappointment. I’m now contemplating revisiting my attempts at using a stack. FML.
I’m having yet another crack at implementing genuinely smooth interpolated movement, and this time I’m trying to use a stack to make sure that the data is ready to be consumed at the point it is required – as opposed to creating an artificial pause at the start of every cycle while some heavy processing takes place.
It’s proving more difficult than I imagined, but I’m sure that truly smooth rendering is almost within my grasp.
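To make the idea of having data ready before it is needed concrete, here is a rough sketch of one common approach. It is not the Gee-Oh! implementation: it swaps the stack for a first-in, first-out buffer and renders slightly in the past, so there are normally two buffered points to blend between. The names `SmoothTrack`, `Fix` and `delayMs` are all invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical sketch: buffers timestamped fixes and interpolates between
// the two that bracket a render time lagging delayMs behind "now".
final class SmoothTrack {
    static final class Fix {
        final long timeMs;
        final double x;
        final double y;

        Fix(long timeMs, double x, double y) {
            this.timeMs = timeMs;
            this.x = x;
            this.y = y;
        }
    }

    private final Deque<Fix> buffer = new ArrayDeque<>();
    private final long delayMs;

    SmoothTrack(long delayMs) {
        this.delayMs = delayMs;
    }

    // Producer side: called whenever a new location arrives.
    synchronized void add(long timeMs, double x, double y) {
        buffer.addLast(new Fix(timeMs, x, y));
    }

    // Renderer side: returns {x, y} for this frame, or null if nothing is buffered.
    synchronized double[] positionAt(long nowMs) {
        long renderTime = nowMs - delayMs;
        // Discard fixes that lie before the segment currently being drawn.
        while (buffer.size() >= 2 && second().timeMs <= renderTime) {
            buffer.removeFirst();
        }
        if (buffer.isEmpty()) {
            return null;
        }
        Fix from = buffer.peekFirst();
        if (buffer.size() == 1 || renderTime <= from.timeMs) {
            return new double[] {from.x, from.y}; // nothing to blend towards yet
        }
        Fix to = second();
        double t = (double) (renderTime - from.timeMs) / (to.timeMs - from.timeMs);
        return new double[] {
            from.x + (to.x - from.x) * t,
            from.y + (to.y - from.y) * t
        };
    }

    private Fix second() {
        Iterator<Fix> it = buffer.iterator();
        it.next();
        return it.next();
    }
}
```

The trade-off is that the marker always lags by `delayMs`. If that delay is shorter than the gap between incoming fixes, the buffer runs dry and the marker still stalls at the last point.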
After I’ve conquered this particular problem, I’m going to take a little break from developing Gee-Oh! / MakingTracksGPS, and probably a break from development altogether (except for my day job).
Need to recharge.
I’ve generally been making good progress in recent months, but lately I feel like it’s one step forwards, two steps back when working on some key aspects:
- Smoothness of interpolation
  - Still seem to be getting this rush to the last point, followed by pause
  - Should be no excuse for this pause, it should be 100% smooth
  - Need to aim towards the n-1 position
- Thread manager efficiency
  - The current switchThreadManagers() functionality is causing a lot of “hangover” connection attempts etc. and it’s having a visual impact on performance – causing a pause every 5 or so seconds
  - Need to implement a short timeout or “killable” download mechanism
    - Instead of waiting for a long timeout to realise that something has gone wrong
- Rendering jumpiness
  - This still appears to rear its head every now and then
  - I believe this is fixed in cases where !alwaysCentreOnCurrentPoint
  - Still need to fully examine behaviour when centred
  - The thread manager fix may have helped/solved this
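The short timeout or “killable” download idea from the list above could be sketched with the standard `java.util.concurrent` classes, roughly as follows. `KillableDownloads` and `fetch` are hypothetical names, and the download itself is passed in as a `Callable`, so nothing here reflects how the app's thread managers actually work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: runs a download with a hard upper bound on waiting time.
final class KillableDownloads {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Returns the download's result, or null if it failed or overran timeoutMs.
    <T> T fetch(Callable<T> download, long timeoutMs) {
        Future<T> pending = pool.submit(download);
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(true); // give up and interrupt the worker thread
            return null;
        } catch (ExecutionException e) {
            return null; // the download itself threw
        } catch (InterruptedException e) {
            pending.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

One caveat: `cancel(true)` only interrupts the worker thread, and a blocking socket read will not necessarily notice. Setting short connect and read timeouts on the connection itself, for example with `HttpURLConnection.setConnectTimeout` and `setReadTimeout`, is what stops the underlying I/O from hanging around.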
So yeah, that’s the general mindset I’m in. I have to draw a line under these three key issues before I’m happy enough to move on and pick some of this low-hanging fruit:
- Finish off theming
- Auto-centre after theme switching
- Fix location discovery logic so that any new manual waypoint is added to the reduced list immediately
- Get better discovery sound
And then I will get stuck into some of this:
- Fix issue where grid squares above are being cleared, without having been properly replaced by the correct level below
- Finish off auto-update API work
- Improve offline download persistence (don’t just discard after 20)
- Design/implement mechanism to allow grid squares to be refreshed
That should cover it for a while. Ta ta!