I'm working on a simple, light-weight forum system using PHP and MySQL with the following database structure (wip):
[img]http://i49.tinypic.com/33k62l3.png[/img]
(haven't optimised data types yet)
On my viewforum.php page, I need to display the author and the last post as you would in any forums and was wondering if it's possible to do in a single query.
Currently I'm using this:
[php]mysql_query("SELECT t.*, u.* FROM threads t, posts p, users u WHERE t.forum_id = '$forum_id' AND p.post_id = t.thread_first_post_id AND u.user_id = p.user_id ORDER BY t.thread_id DESC");[/php]
And looping through the results with mysql_fetch_array, but I can't resolve the last posters user_name with this method. Is there a better alternative - or should I use multiple queries or store the first + last posters username and id in the threads table too and have duplicated data?
I'm new to the SQL syntax and don't have much (i.e any) experience with table aliasing and unions.
[code]
SELECT user_name FROM users WHERE user_id = (SELECT user_id FROM posts WHERE thread_id = ### ORDER BY post_id DESC LIMIT 1)
[/code]
also, the following columns are redundant:
forum_posts
forum_threads
forum_last_post_id
thread_first_post_id
thread_replies
thread_last_post_id
I'll post the replacement for those tomorrow, because typing SQL on an iPhone is a bitch
I tried using:
[php]$result = mysql_query("SELECT t.*, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_first_post_id AND u.user_id = p.user_id) AS firstpost, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_last_post_id AND u.user_id = p.user_id) AS lastpost FROM threads t WHERE t.forum_id = '$forum_id'");[/php]
but I'm getting the following error: Invalid query: Operand should contain 1 column(s)
[QUOTE=turby;19587332]also, the following columns are redundant:
forum_posts
forum_threads
forum_last_post_id
thread_first_post_id
thread_replies
thread_last_post_id[/QUOTE]
forum_posts and forum_threads are for storing a count of the number of post\threads in that forum section. Similarly, thread_replies is for keeping track of the number of replies in that thread. Surely this would be more efficient than calling COUNT every time we want to get the number of replies?
As for forum_last_post_id, thread_first_post_id and thread_last_post_id it's also for keeping track of earliest\most recent posts without needing to have MySQL scan through the tables searching for the earliest and most recent posts. Once again, wouldn't that be more efficient?
Also just so you know, I'm working with him on this and not just jumping in here :smile:
[editline]10:08PM[/editline]
[QUOTE=Shootfast;19587397]I tried using:
[php]$result = mysql_query("SELECT t.*, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_first_post_id AND u.user_id = p.user_id) AS firstpost, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_last_post_id AND u.user_id = p.user_id) AS lastpost FROM threads t WHERE t.forum_id = '$forum_id'");[/php]
but I'm getting the following error: Invalid query: Operand should contain 1 column(s)[/QUOTE]
I wrote that query so forward any PEBKAC\"programmer is an idiot" errors about that one to me.
[QUOTE=mechanarchy;19587416]forum_posts and forum_threads are for storing a count of the number of post\threads in that forum section. Similarly, thread_replies is for keeping track of the number of replies in that thread. Surely this would be more efficient than calling COUNT every time we want to get the number of replies?
As for forum_last_post_id, thread_first_post_id and thread_last_post_id it's also for keeping track of earliest\most recent posts without needing to have MySQL scan through the tables searching for the earliest and most recent posts. Once again, wouldn't that be more efficient?[/QUOTE]
storing the number of threads might be marginally faster than COUNT() but it adds unneeded complexity. Also, you have a potential race condition when adding or removing threads. Much better to just let MySQL handle it the way it should be handled, rather than dodgy old in-DB caches.
I'm a big fan of the philosophy that as much work as is reasonable should be done on the database rather than the webapp itself.
[QUOTE=turby;19587471]storing the number of threads might be marginally faster than COUNT() but it adds unneeded complexity.
I'm a big fan of the philosophy that as much work as is reasonable should be done on the database rather than the webapp itself.[/QUOTE]
I don't see how it would be more complex for the webapp - Either way, whether you call COUNT every time you do it or just use the stored value and called COUNT when you update, you're going to have a query using "COUNT(...)" at some point. It just makes it more efficient if you've got a lot of read-only queries than can just return the static value than ones that are constantly requesting the number of rows.
The only change is you call COUNT when you're storing the new values, so when you're doing a write-query (which there's bound to be less of anyway). In the end, it just seems easier and more efficient to store the static value in my opinion. Though, I'm no expert so if you're sure that's a better way we'll consider it and weigh the pros and cons.
I can see where you're coming from, and it's very well meaning, but flawed.
I'll put together a few arguments for my way tomorrow when I'm not on my iPhone, so you can make an informed decision.
[QUOTE=turby;19587510]I can see where you're coming from, and it's very well meaning, but flawed.
I'll put together a few arguments for my way tomorrow when I'm not on my iPhone, so you can make an informed decision.[/QUOTE]
Alright, thanks. And as for the comment about race-conditions, by default writes are prioritised higher than reads so I don't see how that would be an issue. Though, that is a very good concern so I'll see when you pull some thoughts together :)
[QUOTE=mechanarchy;19587519]Alright, thanks. And as for the comment about race-conditions, by default writes are prioritised higher than reads so I don't see how that would be an issue. Though, that is a very good concern so I'll see when you pull some thoughts together :)[/QUOTE]
Oh, I was thinking about inserts and deletes and how the update on the count column - which (assuming you're using PHP) would have to be a seperate query. Once your action - inserting a row and updating the count value - stops being atomic, you have a potential for a race condition to occur. Even though it wouldn't be very major, it would still be there.
[QUOTE=Shootfast;19587112]
[img]http://i49.tinypic.com/33k62l3.png[/img]
[/QUOTE]
What program is this?
[QUOTE=nivek;19594253]What program is this?[/QUOTE]
I think he said it was PHPMyAdmin's Designer view. I tried finding it on PHPMyAdmin on our host but apparently it's not installed. He said he was hosting it locally while in development so I guess that explains it all.
[QUOTE=mechanarchy;19595264]I think he said it was PHPMyAdmin's Designer view. I tried finding it on PHPMyAdmin on our host but apparently it's not installed. He said he was hosting it locally while in development so I guess that explains it all.[/QUOTE]
[url]http://wiki.phpmyadmin.net/pma/relation#Designer[/url]
Let's you create relationships between tables.
[img]http://i50.tinypic.com/2ljot1f.png[/img]
The blue numbers are then hyperlinks to the proper record, and when inserting data there's a drop down box of available options.
But back onto the question at hand!
[QUOTE=Shootfast;19587397]I tried using:
[php]$result = mysql_query("SELECT t.*, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_first_post_id AND u.user_id = p.user_id) AS firstpost, (SELECT u.* FROM posts p, users u WHERE p.post_id = t.thread_last_post_id AND u.user_id = p.user_id) AS lastpost FROM threads t WHERE t.forum_id = '$forum_id'");[/php]
but I'm getting the following error: Invalid query: Operand should contain 1 column(s)[/QUOTE]
[php]$g_topics = mysql_query("SELECT t.*, u.*
FROM threads AS t
INNER JOIN posts AS p ON (t.last_post_id = p.post_id)
INNER JOIN users AS u ON (p.user_id = u.user_id)
WHERE t.forum_id = " . $forum_id . "
ORDER BY t.thread_id DESC");[/php]
That should work.
Two notes:
1. You should probably add last_post_time to your thread table, so you can sort on that (instead of sorting on thread_id)
2. INT's should have a length on 10, not 11. So in your table structure it should be INT(10) not INT(11), also make sure they're UNSIGNED. More information about INT and UNSIGNED: [url]http://dev.mysql.com/tech-resources/articles/visual-basic-datatypes.html[/url]. You'll notice that making your datatype UNSIGNED means that it will only allow positive numbers (0 and up) and will allow twice the amount of rows (which you probably don't need, but it's good to keep it in mind).
[editline]12:04AM[/editline]
[QUOTE=turby;19587548]Oh, I was thinking about inserts and deletes and how the update on the count column - which (assuming you're using PHP) would have to be a seperate query. Once your action - inserting a row and updating the count value - stops being atomic, you have a potential for a race condition to occur. Even though it wouldn't be very major, it would still be there.[/QUOTE]
I'll take my chances with having a potential race condition over a database that is getting killed with COUNT queries that are called every pageview anytime. Seriously, having a perfectly normalized database is not always a good thing.
Why do you think big forums like vBulletin / IPB / phpBB / etc. all store information like this statically. If you have 1.000 users online, that means that there's an extra 1.000 queries that will have to be executed every pageview when there is absolutely no need to. Information about post count is not vital, at all and will only add extra load to your database server.
Fixed using this lovely piece of code:
[php]$result = mysql_query("SELECT t.*, first_post.*, first_user.*, last_post.*, last_user.* FROM threads t
INNER JOIN posts AS first_post ON (t.thread_first_post_id = first_post.post_id)
INNER JOIN users AS first_user ON (first_post.post_user_id = first_user.user_id)
INNER JOIN posts AS last_post ON (t.thread_last_post_id = last_post.post_id)
INNER JOIN users AS last_user ON (last_post.post_user_id = last_user.user_id)
WHERE t.forum_id = '$forum_id'
ORDER BY t.thread_id DESC");[/php]
Will fix the data types at a later point in development.
We're up to writing the code to post a new thread.. and then realised what turby was talking about.
We need to post a new thread and then a new post, though by the looks of things we'll need 3 queries - one to post the thread, another to post the post (using the thread's id), and then a third to update the thread's first and last_post_id with the new post.
Can this be done in 2 queries, or in a better way?
[QUOTE=MD1337;19597122]
2. INT's should have a length on 10, not 11. So in your table structure it should be INT(10) not INT(11), also make sure they're UNSIGNED. More information about INT and UNSIGNED: [url]http://dev.mysql.com/tech-resources/articles/visual-basic-datatypes.html[/url]. You'll notice that making your datatype UNSIGNED means that it will only allow positive numbers (0 and up) and will allow twice the amount of rows (which you probably don't need, but it's good to keep it in mind).[/QUOTE]
[QUOTE=Shootfast;19587112](haven't optimised data types yet)[/quote]
Yeah, we know. As soon as we're at a stage where we don't think the tables are going to change much we'll start fixing up the data types.
[QUOTE=MD1337;19597122]
Why do you think big forums like vBulletin / IPB / phpBB / etc. all store information like this statically. If you have 1.000 users online, that means that there's an extra 1.000 queries that will have to be executed every pageview when there is absolutely no need to. Information about post count is not vital, at all and will only add extra load to your database server.[/QUOTE]
Having 1000 users online does not mean an extra 1000 queries - you can join COUNT statements on and alias them and get it all in one go. It's more the database having to count the rows that's the issue in this case.
[QUOTE=MD1337;19597122]I'll take my chances with having a potential race condition over a database that is getting killed with COUNT queries that are called every pageview anytime. Seriously, having a perfectly normalized database is not always a good thing.
Why do you think big forums like vBulletin / IPB / phpBB / etc. all store information like this statically. If you have 1.000 users online, that means that there's an extra 1.000 queries that will have to be executed every pageview when there is absolutely no need to. Information about post count is not vital, at all and will only add extra load to your database server.[/QUOTE]
I went and did my homework, and it turns out that if he does a COUNT(id) rather than a COUNT(*) (as I initially suggested), there will not be a performance impact as mysql caches that data anyway
Sorry, you need to Log In to post a reply to this thread.