[Answered ]-What's the best way to optimize this MySQL query?

First of all, the SQL is badly formatted. The most obvious error is the line splitting before each AS clause. Second obvious problem is using implicit joins instead of explicitly using INNER JOIN ... ON ....

Now to answer the actual question.

Without knowing the data or the environment, the first thing I’d look at would be some of the MySQL server settings, such as sort_buffer and key_buffer. If you haven’t changed any of these, go read up on them. The defaults are extremely conservative and can often be raised more than ten times their default, particularly on the large iron like you have.

Having reviewed that, I’d be running pieces of the query to see speed and what EXPLAIN says. The effect of indexing can be profound, but MySQL has a “fingers-and-toes” problem where it just can’t use more than one per table. And JOINs with filtering can need two. So it has to descend to a rowscan for the other check. But having said that, dicing up the query and trying different combinations will show you where it starts stumbling.

Now you will have an idea where a “tipping point” might be: this is where a small increase in some raw data size, like how much it needs to extract, will result in a big loss of performance as some internal structure gets too big. At this point, you will probably want to raise the temporary tables size. Beware that this kind of optimization is a bit of a black art.

However, there is another approach: denormalization. In a simple implementation, regularly scheduled scripts will run this expensive query from time-to-time and poke the data into a separate table in a structure much closer to what you want to display. There are multiple variations of this approach. It can be possible to keep this up-to-date on-the-fly, either in the application, or using table triggers. At the other extreme, you could allow your application to run the expensive query occasionally, but cache the result for a little while. This is most effective if a lot of people will call it often: even 2 seconds cache on a request that is run 15 times a second will show a visible improvement.

You could find ways of producing the same data by running half-a-dozen queries that each return some of the data, and post-processing the data. You could also run version of your original query that returns more data (which is likely to be much faster because it does less filtering) and post-process that. I have found several times that five simpler, smaller queries can be much faster – an order of magnitude, sometimes two – than one big query that is trying to do it all.

staticsan

[Answered ]-Writing a Template Tag in Django

No index will help you since you are scanning entire tables.
As your database grows the query will always get slower.

Consider accumulating the stats : after every game, insert the row for that game, and also increment counters in the player’s row, Then you don’t need to count() and sum() because the information is available.

bobflux

[Answered ]-Django installation – trouble with creating a mysite directory

select * is bad most times – select only the columns you need
break the select into multiple simple selects, use temporary tables when needed
the sum(case part could be done with a subselect
mysql has a very bad performance with or-expressions. use two selects which you union together

codymanix

[Answered ]-AWS Elastic Beanstalk & Django 1.5: Collectstatic to S3 fails

Small Improvement

select *, (kills / deaths) as killdeathratio, (totgames - wins) as losses from (select gp.name as name, gp.gameid as gameid, gp.colour as colour, Avg(dp.courierkills) as courierkills, Avg(dp.raxkills) as raxkills, Avg(dp.towerkills) as towerkills, Avg(dp.assists) as assists, Avg(dp.creepdenies) as creepdenies, Avg(dp.creepkills) as creepkills, Avg(dp.neutralkills) as neutralkills, Avg(dp.deaths) as deaths, Avg(dp.kills) as kills, sc.score as totalscore, Count(1 ) as totgames, Sum(case when ((dg.winner = 1 and dp.newcolour < 6) or (dg.winner = 2 and dp.newcolour > 6)) then 1 else 0 end) as wins from gameplayers as gp, ( select * from dotagames dg1 where dg.winner <> 0 ) as dg, games as ga, dotaplayers as dp, scores as sc where and dp.gameid = gp.gameid and dg.gameid = dp.gameid and dp.gameid = ga.id and gp.gameid = dg.gameid and gp.colour = dp.colour and sc.name = gp.name group by gp.name having totgames >= 30 ) as h order by totalscore desc

Changes:
1. count (*) chnaged to count(1)
2. In the FROM, The number of rows are reduced.

copperstone

Source:stackexchange.com

Leave a comment Cancel reply