optimizing code ii


Working on my website PHP code off-line out of the wild... Still.

The keywords facility (as I refer to it) has been optimized further.
Remember that:
  "The keywords page - without any keyword refinement - currently
  produces 717 links."


717 links is perhaps rather too much choice. Who's going to notice if
keywords only used once are not displayed? That leaves 333. But who's
going to miss those keywords which are only used twice? 230 remain.
Thrice? 190.

That's a reduction of 527 links, but a reduction which optimizes for
the user as well as the server by filtering out extraneous information.
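
The pruning amounts to counting how often each keyword is used and
dropping those below a threshold. A minimal shell sketch of that idea,
assuming a cleaned list of page:keyword,keyword lines with no spaces
inside keywords. The function name frequent_keywords and its threshold
argument are made up for illustration; the site itself does this in PHP.

```shell
# frequent_keywords MIN: read "page:kw1,kw2,..." lines on stdin and print
# each keyword used at least MIN times, followed by its count.
# (Hypothetical helper, not the site's actual code.)
frequent_keywords() {
    min=$1
    cut -d: -f2 |            # keep only the keyword list
        tr ',' '\n' |        # one keyword per line
        sort | uniq -c |     # count identical keywords
        awk -v min="$min" '$1 >= min { print $2, $1 }'
}
```

Fed the three sample fruit and vegetable lines used in this entry,
frequent_keywords 2 prints "fruit 2" and "tree 2"; every keyword used
only once is dropped.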

It also means that the page generation time on:

[sirrom@scrapyard ~]$ uname -a
Linux scrapyard 3.0-ARCH #1 SMP PREEMPT Fri Oct 7 11:35:34 CEST 2011
x86_64 Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel
GNU/Linux

is reduced further, from 8s to 0.5s and now to 0.13s.

Tell a lie. I've missed out an intermediate step which got the time down
to 0.13s: removing a call to exec which ran GNU make every time the
keywords were used. Make ran grep to create a file containing each page
name and the keywords defined for it. Because pages are not regularly
added to jwm-art.net, it was a waste of time to continually call exec to
run make to run grep. Additionally, because adding pages generally also
requires SSH access, make may as well be run manually.

The generated file looks somewhat like this:

../dat/apple.dat:keywords=fruit,Tree,round
../dat/banana.dat:keywords= fruit,tree,tropical
../dat/carrot.dat:keywords=vegetable, root,orange

(Note the inconsistent capitalization and white space.)

So every time the keywords page was requested, processing was required
to get the page name by removal of path and extension, and then the
keywords themselves by removing everything before the equals sign.

I decided that some of this processing, such as whitespace removal
and de-capitalisation, could be taken out of each page request by moving
it into the Makefile. This led me on a regular expression learning quest
and to sed. Sed is great btw.

The Makefile looks somewhat like this:

KEYLIST := keysdats.list
FILES := $(wildcard ../dat/*.dat)

$(KEYLIST): $(FILES)
	grep '^keywords' $^ | sed -E -f sed_clean_grep_output > $@

And running make using that Makefile produces the keysdats.list file,
which now looks like this:

apple:fruit,tree,round
banana:fruit,tree,tropical
carrot:vegetable,root,orange

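This is achieved by sed using the expressions in the
sed_clean_grep_output file (extended regular expression syntax, plus
GNU sed's \L to lower-case), which contains:

# strip the leading ../dat/ path and the .dat extension
s|\.{2}/dat/||
s/\.dat:/:/
# drop the "keywords=" prefix and white space around the commas
s/keywords ?= ?//
s/ *, */,/g
# lower-case everything after the page name
s#(^.*:)(.*)#\1\L\2#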
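
To try the clean-up without touching the site, the same kind of
expressions can be run over a few sample lines. This is only a sketch:
the wrapper name clean_grep_output is made up, it assumes GNU sed (\L is
a GNU extension) with -E for extended regular expressions, and it
assumes a rule for white space around commas is wanted so that the
carrot line comes out clean.

```shell
# clean_grep_output: turn "../dat/page.dat:keywords=..." lines on stdin
# into "page:kw1,kw2" lines. Hypothetical wrapper; needs GNU sed for \L.
clean_grep_output() {
    sed -E \
        -e 's|^\.{2}/dat/||' \
        -e 's/\.dat:/:/' \
        -e 's/keywords ?= ?//' \
        -e 's/ *, */,/g' \
        -e 's#^([^:]*:)(.*)#\1\L\2#'
}

# Example: feed it the raw grep output shown earlier.
printf '%s\n' \
    '../dat/apple.dat:keywords=fruit,Tree,round' \
    '../dat/banana.dat:keywords= fruit,tree,tropical' \
    '../dat/carrot.dat:keywords=vegetable, root,orange' |
    clean_grep_output
```

The last expression uses [^:]* rather than .* for the page name, so
that only the text up to the first colon is left in its original case.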

see:
http://stackoverflow.com/questions/779847/sed-change-case-of-substitution-group
http://regexlib.com/CheatSheet.aspx
http://www.regular-expressions.info/reference.html
http://www.robelle.com/smugbook/regexpr.html
http://www.grymoire.com/Unix/Sed.html
http://www.gnu.org/software/sed/manual/sed.html#The-_0022s_0022-Command
http://www.eng.cam.ac.uk/help/tpl/unix/sed.html

Now, what of the keyword results? Click the 'digital' keyword and you
used to be presented with over 200 results. That's 200 results
consisting of a thumbnail entry (thumbnail-image, title, and short
description), plus 250 (IIRC) characters of text culled from the
'information' section of each page, wrapped in an HTML <div> element
using the 'result' class.

First results optimization: remove the long description (i.e. the culled
text) and thus the need for the extra <div>, plus the CSS for the
'result' class. With the removal of the HTML/CSS alone, that's a
reduction of 4.6kB for the results of the 'digital' keyword.

Second results optimization is to only display ten results at a time.
It was purely out of laziness that I hadn't attempted this until now...
Perhaps.
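
Showing ten at a time is just a matter of working out which slice of
the result list belongs to a given page number. A rough shell sketch of
that arithmetic, assuming one result per line and pages numbered from 1.
The function name results_page is hypothetical; the site's real paging
is done in PHP.

```shell
# results_page PAGE [PER_PAGE]: print one page of the result lines read
# from stdin. Pages count from 1; PER_PAGE defaults to 10.
# (Hypothetical helper showing the slice arithmetic only.)
results_page() {
    page=$1
    per_page=${2:-10}
    first=$(( (page - 1) * per_page + 1 ))   # first line of this page
    last=$(( page * per_page ))              # last line of this page
    sed -n "${first},${last}p"
}
```

For example, seq 1 25 | results_page 2 prints 11 to 20, and page 3 of
the same input prints only the five remaining lines, 21 to 25.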

That was the week before last.


Information

"optimizing code ii"

a follow-up text about optimizing PHP code for my website

Journal entry - 07:36 Saturday 12 November 2011

DISCLAIMER: The opinions and attitudes of James W. Morris as expressed here in the past may or may not accurately reflect the opinions and attitudes of James W. Morris at present, moreover, they may never have.

this page last updated: 29th April 2013 jwm-art.net (C) 2003 - 2017 James W. Morris
