SQLite retrieval corruption with massive amounts of data

This is the second time that I’ve had to deal with database corruption. I should warn those who make Lua scripts that store huge amounts of data in the SQLite database. And by huge I mean a couple of thousand entries is enough to fuck things up.

Here’s the deal
You won’t be able to retrieve all the data if your table has a couple of thousand rows.

How to reproduce (so you can see for yourself)
[lua]
-- First create a nice table
sql.Query("CREATE TABLE testdata(test VARCHAR NOT NULL PRIMARY KEY);")
-- Insert some 10,000 things in it (obviously this will lag a bit)
for i = 1, 10000 do
	sql.Query("INSERT INTO testdata VALUES('testing" .. i .. "');")
end

-- See if it worked:
print(sql.QueryValue("SELECT COUNT(*) FROM testdata;"))
-- output for me: 10000

-- Now here's the problem:
local A = sql.Query("SELECT * FROM testdata;")
print(table.Count(A))
-- output for me: 786

-- Let's see if it prints something else if we just try it again:
A = sql.Query("SELECT * FROM testdata;")
print(table.Count(A))
-- output for me: 786

-- Hmm, no change. What if we insert something and try again?
sql.Query("INSERT INTO testdata VALUES('abcd');")
local A = sql.Query("SELECT * FROM testdata;")
print(table.Count(A))
-- output for me: 787
[/lua]

Some notes:

  • I seem to be able to retrieve a random part of the database, in random order
  • A contains both tables and strings (this is definitely corrupt)
  • In my test case, I have exactly ONE string in A

Make sure you're aware of this!

Some more test cases:
[lua]
-- Count the number of strings and tables in A
local tables = 0
local strings = 0

for k, v in pairs(A) do
	if type(v) == "table" then
		tables = tables + 1
	elseif type(v) == "string" then
		strings = strings + 1
	end
end
print(tables, strings)
-- Output for me: 785, 1

-- Well then, what is this string?
for k, v in pairs(A) do
	if type(v) == "string" then
		print(v)
	end
end
-- Output for me: "testing9216", which is one of the values I inserted...
[/lua]

Can’t you just retrieve the data in steps with the LIMIT and OFFSET clauses?
That seems to work, but it’s a really hacky workaround. It’s also slower than a single query: SQLite implements OFFSET by stepping over and discarding the skipped rows, so every batch re-scans from the start of the table and you pay for multiple queries instead of one.
For now, a limit of 1000 seems to work:

[lua]
local A = {}
-- QueryValue returns a string, so convert it before using it as a loop limit
local count = tonumber(sql.QueryValue("SELECT COUNT(*) FROM testdata;")) or 0
for i = 0, count, 1000 do
	local B = sql.Query("SELECT * FROM testdata LIMIT 1000 OFFSET " .. i .. ";")
	for a, b in pairs(B or {}) do
		table.insert(A, b)
	end
end
print(table.Count(A))
-- Output: 10000
[/lua]

My experience with this problem
I have a table in FPP which holds the blocked models. I need to retrieve all of them in order to block them. Server owners have reported that this breaks after a certain number of models has been inserted.

SQLite is lame. It’s useful, but it has no interface for people who don’t understand SQL statements and functions.

If that is the case, then I’d say this could have something to do with the limit of temporary values ILuaInterface can hold (something like 1024). AzuiSleet ran into similar problems with some of his modules. Hell, even Spacetech with his gm_navigation.

It’s not about how lame it is, it’s the plain fact that it’s not working properly.

@FPTje, you should post that in the gm13 forum.

I think this is an error in the GMod implementation, since the sqlite3 command line tool works fine:

.output TEST.txt
SELECT * FROM testdata;

Output: TEST.txt with 10,000 lines.

[editline]29th June 2012[/editline]

It’s not necessarily a gmod 13 problem. I haven’t tested this in GMod 13, only in GMod 12.

This is a problem with tables themselves. I ran into it ages ago, when the tmysql rank system I made for Assmod (back when I had pretty much just learned Lua) stopped working because we had over a thousand ranks. I used to load the whole ranks table when the script ran; at the time that seemed like a better choice than running a query whenever a player joined.

player:Set/GetPData()?
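For per-player data, GMod’s PData helpers read one value at a time from the same local SQLite database, so you never pull thousands of rows in a single query. A minimal sketch of that approach (the "rank" key is made up for illustration, not from any addon):

[lua]
-- Server-side sketch using GMod's built-in PData helpers.
hook.Add("PlayerInitialSpawn", "LoadRankFromPData", function(ply)
	-- GetPData fetches a single row for this player instead of caching the whole table
	local rank = ply:GetPData("rank", "guest")
	ply:SetNWString("rank", rank)
end)

-- Elsewhere, when an admin changes someone's rank:
-- ply:SetPData("rank", "moderator")
[/lua]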

Maybe it’s the ILuaInterface, but pure Lua tables can easily hold 10,000 or more entries. The problem is that sql.Query can’t retrieve that many rows.

It could be a bug in the SQLite implementation or in the ILuaInterface or something else, but it’s not the Lua tables.

This issue was most likely carried over to GMod 13, so as soon as you can test it there, make a post about it.

I’m not saying it’s useless. I just don’t understand why someone would use it over MySQL. It’s something you can argue either way; I guess it’s a matter of personal opinion.

Saving client settings is the most suitable use for it I guess.
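As a sketch of that use case (the table and column names here are made up, not from any addon), client-side settings could be stored like:

[lua]
-- Client-side: persist a setting in GMod's local SQLite database.
-- "my_settings" and its columns are hypothetical names for this example.
if not sql.TableExists("my_settings") then
	sql.Query("CREATE TABLE my_settings(setting VARCHAR PRIMARY KEY, value VARCHAR);")
end

local function SaveSetting(setting, value)
	-- sql.SQLStr quotes and escapes the values so they can't break the statement
	sql.Query("REPLACE INTO my_settings VALUES(" .. sql.SQLStr(setting) .. ", " .. sql.SQLStr(value) .. ");")
end

local function LoadSetting(setting, default)
	return sql.QueryValue("SELECT value FROM my_settings WHERE setting = " .. sql.SQLStr(setting) .. ";") or default
end
[/lua]

Since settings are read one value at a time, this stays well under whatever row limit sql.Query has.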

Comes with GMod by default, if you’re making an addon it’s a lot easier to use something that every server definitely has rather than force them to install a binary module as well.

This is probably the temp overflow issue, although I can’t find the gm_sqlite code to confirm. Your limit of 1000 is a good hint though.

http://www.facepunch.com/showthread.php?t=1151512&p=34298139&viewfull=1#post34298139

I was coming across the same issue with SQLite. I think I made a topic about this a few years ago as well. The data in the table isn’t corrupt itself; it’s the SELECT that seems to be messing up the retrieval. I just worked around it by selecting data in batches of 1000.

[editline]30th June 2012[/editline]

[lua]
function db:GetCacheData()
	self.Cache._cache = {}
	self._cacheCount = 0

	local cacheData = self:Query( "SELECT COUNT(*) FROM " .. self.dbName, false )
	if cacheData[1] and cacheData[1]["COUNT(*)"] then
		self._cacheCount = tonumber( cacheData[1]["COUNT(*)"] ) or 0
	end

	-- Small tables fit in a single query
	if self._cacheCount < 1024 then
		self.Cache._cache = self:Query( "SELECT * FROM " .. self.dbName, false ) or {}
		return
	end

	local tmp = {}
	local neededPackets = math.ceil( self._cacheCount / 1000 )
	print( "FEL --> " .. self.dbName .. " --> Contains more than 1024 entries! Grabbing " .. neededPackets .. " packets." )

	-- Fetch in batches of 1000 using LIMIT offset,count
	for i = 0, neededPackets - 1 do
		table.Add( tmp, self:Query( "SELECT * FROM " .. self.dbName .. " LIMIT " .. i * 1000 .. ",1000", false ) )
	end

	self.Cache._cache = tmp
end
[/lua]