NT Crash Thread

Post by **Xinux** » Mon Jan 12, 2015 3:26 pm

OK, new lockup, not a crash: "MySQL server has gone away."

There was another client further up the log spewing packets that don't even look like Vanguard packets.

I'm attaching the log here.


Post by **Lokked** » Mon Jan 12, 2015 4:11 pm

Did some searching on this error. There are a couple possibilities, but I'm thinking it might be this:
You can also encounter this error with applications that fork child processes, all of which try to use the same connection to the MySQL server. This can be avoided by using a separate connection for each child process.

Post by **Lokked** » Mon Jan 12, 2015 4:15 pm

Again, this user's post describes our situation, but only !!potentially!!
I'm not at home to look at the implementation of the MySQL connector in our code. If it's protected by a mutex, then the idea that multithreading is the culprit goes out the window:

Be aware of multi-threading.

I have used a c++ singleton class to encapsulate the mysql C API.
Using the same 'instance' of the class in two separate threads (or sharing the same mysql connection between threads) has raised this kind of error (CR_SERVER_GONE_ERROR or CR_SERVER_LOST) when one thread, which runs every 30 seconds, executes a query or a SQL command at the same time as the other thread.

Post by **Lokked** » Mon Jan 12, 2015 4:16 pm

Another possibility: Persistent Connections. Doesn't make sense why the original devs would turn this on, but I'll look at it:

I ran into this same error, and as a previous comment noted, and I think bears repeating, was eliminated when I stopped using persistent connections to connect to the MySQL database.
That is, I changed my "PDO::ATTR_PERSISTENT => true" settings to "PDO::ATTR_PERSISTENT => false" and the problem went away.
And there was great rejoicing.

Post by **Lokked** » Mon Jan 12, 2015 4:19 pm

This is an interesting post:
http://stackoverflow.com/questions/1455 ... ncurrently

Post by **Lokked** » Mon Jan 12, 2015 4:22 pm

More information:

3.16. Concurrent Queries on a Connection

An important limitation of the MySQL C API library — which MySQL++ is built atop, so it shares this limitation — is that you can only have one query in progress on each connection to the database server. If you try to issue a second query while one is still in progress, you get an obscure error message about “Commands out of sync” from the underlying C API library. (You normally get this message in a MySQL++ exception unless you have exceptions disabled, in which case you get a failure code and Connection::error() returns this message.)

There are lots of ways to run into this limitation:

• The easiest way is to try to use a single Connection object in a multithreaded program, with more than one thread attempting to use it to issue queries. Unless you put in a lot of work to synchronize access, this is almost guaranteed to fail at some point, giving the dread “Commands out of sync” error.

• You might then think to give each thread that issues queries its own Connection object. You can still run into trouble if you pass the data you get from queries around to other threads. What can happen is that one of these child objects indirectly calls back to the Connection at a time where it’s involved with another query. This is properly covered elsewhere, in Section 7.4, “Sharing MySQL++ Data Structures”.

• One way to run into this problem without using threads is with “use” queries, discussed above. If you don’t consume all rows from a query before you issue another on that connection, you are effectively trying to have multiple concurrent queries on a single connection. Here’s a recipe for this particular disaster:
UseQueryResult r1 = query.use("select garbage from plink where foobie='tamagotchi'");
UseQueryResult r2 = query.use("select blah from bonk where bletch='smurf'");

The second use() call fails because the first result set hasn’t been consumed yet.

• Still another way to run into this limitation is if you use MySQL’s multi-query feature. This lets you give multiple queries in a single call, separated by semicolons, and get back the results for each query separately. If you issue three queries using Query::store(), you only get back the first query’s results with that call, and then have to call store_next() to get the subsequent query results. MySQL++ provides Query::more_results() so you know whether you’re done, or need to call store_next() again. Until you reach the last result set, you can’t issue another query on that connection.

• Finally, there’s a way to run into this that surprises almost everyone sooner or later: stored procedures. MySQL normally returns at least two result sets for a stored procedure call. The simple case is that the stored procedure contains a single SQL query, and it succeeds: you get two results, first the results of the embedded SQL query, and then the result of the call itself. If there are multiple SQL queries within the stored procedure, you get more than two result sets. Until you consume them all, you can’t start a new query on the connection. As above, you want to have a loop calling more_results() and store_next() to work your way through all of the result sets produced by the stored procedure call.
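The loop described in the last two bullets can be sketched roughly as below. It deliberately does not link against MySQL++: `FakeQuery` and `ResultSet` are hypothetical stand-ins that only borrow the `store()`, `more_results()` and `store_next()` names from the text above, so treat this as an illustration of the loop shape rather than the library's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using ResultSet = std::vector<std::string>;  // hypothetical stand-in for a real result type

// Hypothetical fake that mimics the store() / more_results() / store_next()
// shape described above. It is NOT MySQL++; it only hands out queued result sets.
class FakeQuery {
public:
    explicit FakeQuery(std::vector<ResultSet> pending) : pending_(std::move(pending)) {}

    ResultSet store() { return take(); }
    ResultSet store_next() {
        assert(more_results());  // asking for a result set that isn't there is a caller bug
        return take();
    }
    bool more_results() const { return next_ < pending_.size(); }

private:
    ResultSet take() { return next_ < pending_.size() ? pending_[next_++] : ResultSet{}; }

    std::vector<ResultSet> pending_;
    std::size_t next_ = 0;
};

// Consumes every result set produced by one call, so the connection is free
// for the next query afterwards.
template <typename Query>
std::vector<ResultSet> drain_all_results(Query& query) {
    std::vector<ResultSet> all;
    all.push_back(query.store());
    while (query.more_results()) {
        all.push_back(query.store_next());
    }
    return all;
}
```

The point is only that nothing else gets issued on the connection until `more_results()` reports that everything has been consumed.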

Post by **John Adams** » Mon Jan 12, 2015 6:21 pm

[quote="Lokked"]You can also encounter this error with applications that fork child processes, all of which try to use the same connection to the MySQL server. This can be avoided by using a separate connection for each child process.[/quote]
I do not believe this is us. Yes, we run multiple threads, but I made such a stink about not tapping the database constantly (for many reasons, including this one) that I do not think we're doing much outside SaveCharacter -- which is almost ALWAYS where this is bombing out.

I have tried to fix the convoluted way we have implemented SaveCharacter by putting in more logging and fewer bool checks on every line of code (wtf), in the hopes that the loggers might tell us where they are when they shit the bed. But lately, I literally see a player being saved just fine and the next round, boom. MySQL has gone away.

It's definitely not max_allowed_packet; mine is 100MB. It **should not** be my network connection, because all machines are on the same ESXi host; i.e., there is no physical switch or copper, it's all virtual. The ONE thing I am leaning towards is the hesitancy in the web server: when you search or click on anything here, there should be ZERO delay loading the phpBB3 pages, but there is a bad delay sometimes. That delay could be exacerbating the SQL issues from the VGOEmu VM. Meaning, if my LAMP machine is stuttering due to bad disk or NIC or whatever, and VGOEmu is trying to ask SQL for something, even a 10ms "I'm not here now!" message in the connector could cause it to simply panic.

This is why I am suggesting we build our Database class smarter, so it tries again instead of quitting like a quitter on the first failed attempt. theFoof made a mysql_ping() heartbeat for me long ago, for this very reason.
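A minimal sketch of that "try again" idea, assuming the connection is reached through two hypothetical hooks (`run_query` and `ping`, which in real code would presumably wrap `mysql_query()`/`mysql_errno()` and `mysql_ping()`). The two error codes are the C API's CR_SERVER_GONE_ERROR and CR_SERVER_LOST; everything else here, including `ConnectionHooks` and `query_with_retry`, is made up for illustration and is not our Database class.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Client error codes from the MySQL C API (errmsg.h).
constexpr int kServerGone = 2006;  // CR_SERVER_GONE_ERROR
constexpr int kServerLost = 2013;  // CR_SERVER_LOST

// Hypothetical hooks standing in for the real connector.
struct ConnectionHooks {
    std::function<int(const std::string&)> run_query;  // 0 on success, else a client error code
    std::function<void()> ping;                        // asks the client library to check/reconnect
};

// Runs the query; if the connection dropped, pings and tries again, up to
// max_attempts tries in total. Any other error is returned immediately.
int query_with_retry(const ConnectionHooks& conn, const std::string& sql, int max_attempts) {
    assert(max_attempts >= 1);
    int err = 0;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        err = conn.run_query(sql);
        if (err == 0) return 0;                                    // success
        if (err != kServerGone && err != kServerLost) return err;  // not a dropped link, don't retry
        conn.ping();                                               // dropped link: reconnect, then loop
    }
    return err;  // still failing after the last attempt
}
```

Two caveats with this kind of retry: `mysql_ping()` only reconnects if auto-reconnect is enabled on the handle, and a reconnect drops session state (open transactions, temporary tables), so blindly re-running a statement is only safe when it can be repeated without harm.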

I cannot address Persistence. I have no idea why it would need to be. In theory, the world starts, reads SQL, and it's done -- forever. Until that player logs in, then gets their data. Then it's done. FOREVER. Until the timer says, SaveCharacter. Then it's done. FOREVER.

That's what I'm after. Not dragging hotbar buttons and seeing an INSERT and DELETE happen at the same time that action is happening. We need to get away from hitting the DB at the time an action takes place in the world. Period. But that's another story.
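To make the "change it in memory, save it on the timer" idea concrete, here is a rough sketch. None of it is existing VGOEmu code: `HotbarState`, the `character_hotbar` table and its columns are all hypothetical, and `flush()` only builds the statements that a SaveCharacter pass might run.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory hotbar: slot -> ability id. Player actions only touch
// memory and set a dirty flag; no SQL is produced until flush() is called.
class HotbarState {
public:
    void set_slot(int slot, int ability_id) {
        assert(slot >= 0);
        slots_[slot] = ability_id;
        dirty_ = true;
    }

    void clear_slot(int slot) {
        if (slots_.erase(slot) > 0) dirty_ = true;
    }

    // Meant to be called from the save timer. Returns the statements for one
    // batch (empty if nothing changed since the last flush) and clears the flag.
    std::vector<std::string> flush(int character_id) {
        std::vector<std::string> statements;
        if (!dirty_) return statements;
        const std::string id = std::to_string(character_id);
        // Table and column names below are made up for the example.
        statements.push_back("DELETE FROM character_hotbar WHERE character_id = " + id);
        for (const auto& entry : slots_) {
            statements.push_back(
                "INSERT INTO character_hotbar (character_id, slot, ability_id) VALUES (" +
                id + ", " + std::to_string(entry.first) + ", " +
                std::to_string(entry.second) + ")");
        }
        dirty_ = false;
        return statements;
    }

private:
    std::map<int, int> slots_;
    bool dirty_ = false;
};
```

Dragging a button around ten times would then cost ten map updates and a single batch at the next save, instead of ten INSERT/DELETE pairs while the player is still dragging.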

Post by **John Adams** » Mon Jan 12, 2015 6:23 pm

Oh, what I started out saying before my tangent: in EQ2Emu, when we added "Threaded Startup" (meaning systems spawned a thread and loaded there so as not to halt all processing), it was at that time that we had to spawn new MySQL connections for each thread that was PULLING data from the database. That definitely makes sense to me.

We shouldn't be doing this. If we are, it's wrong. (at least until I ask you for threaded startup)
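For reference, the "one connection per loader thread" arrangement described for EQ2Emu could look something like this sketch. It is not EQ2Emu's code: `FakeConnection`, `connection_for_this_thread` and `run_threaded_startup` are hypothetical, and the fake connection is just a numbered struct so the example runs without a database.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a real connection handle.
struct FakeConnection {
    int id;
};

std::atomic<int> g_connections_opened{0};

// Each thread gets its own connection, opened lazily on first use and reused
// for every later call made from that same thread.
FakeConnection& connection_for_this_thread() {
    thread_local FakeConnection conn{++g_connections_opened};
    return conn;
}

// Starts `threads` loader threads that each touch the database twice, and
// returns how many connections were opened as a result.
int run_threaded_startup(int threads) {
    assert(threads > 0);
    const int before = g_connections_opened.load();
    std::vector<std::thread> loaders;
    for (int t = 0; t < threads; ++t) {
        loaders.emplace_back([] {
            connection_for_this_thread();  // first use opens this thread's connection
            connection_for_this_thread();  // second use reuses it
        });
    }
    for (auto& loader : loaders) loader.join();
    return g_connections_opened.load() - before;
}
```

With three loader threads this opens three connections, one per thread, and no thread ever touches another thread's handle.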

Post by **Lokked** » Tue Jan 13, 2015 12:07 am

I've committed something to try, for no better reason than nothing else makes sense. I've put a mutex around the query statements. I'm able to log in, rift, camp, come back, etc.
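For anyone following along, the idea is roughly the sketch below. This is not the committed code: `FakeConnection`, `GuardedDatabase` and `hammer_guarded_database` are made-up names, and the fake connection only records whether two callers were ever inside it at once, so the effect of the lock is visible without a MySQL server.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a real MySQL connection. It records whether two
// callers were ever inside run_query() at the same time.
class FakeConnection {
public:
    bool run_query(const std::string& sql) {
        (void)sql;
        if (in_flight_.fetch_add(1) != 0) overlapped_ = true;
        std::this_thread::sleep_for(std::chrono::microseconds(50));  // pretend to do work
        in_flight_.fetch_sub(1);
        return true;
    }
    bool overlapped() const { return overlapped_; }

private:
    std::atomic<int> in_flight_{0};
    std::atomic<bool> overlapped_{false};
};

// Hypothetical wrapper: every query on the shared connection takes the mutex.
class GuardedDatabase {
public:
    bool query(const std::string& sql) {
        std::lock_guard<std::mutex> lock(mutex_);
        return conn_.run_query(sql);
    }
    bool overlapped() const { return conn_.overlapped(); }

private:
    std::mutex mutex_;
    FakeConnection conn_;
};

// Hammers one GuardedDatabase from several threads; returns true if no two
// queries ever overlapped on the underlying connection.
bool hammer_guarded_database(int threads, int queries_per_thread) {
    assert(threads > 0);
    GuardedDatabase db;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&db, queries_per_thread] {
            for (int i = 0; i < queries_per_thread; ++i) db.query("SELECT 1");
        });
    }
    for (auto& w : workers) w.join();
    return !db.overlapped();
}
```

One thing to watch with a real connection: if rows are read with an unbuffered "use"-style fetch, the lock would have to stay held until the last row is read, not just while the query is issued.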

Post by **John Adams** » Tue Jan 13, 2015 3:00 pm

It's live. May god have mercy on your soul.
