* [sup-talk] [PATCH] cache results of Person.from_address
@ 2009-08-17 6:39 Rich Lane
2009-08-17 13:07 ` Andrei Thorp
2009-08-22 14:10 ` William Morgan
0 siblings, 2 replies; 5+ messages in thread
From: Rich Lane @ 2009-08-17 6:39 UTC
The regexes in this function are very expensive, so caching improves
performance significantly for queries and slightly for indexing.
---
lib/sup/cache.rb | 46 ++++++++++++++++++++++++++++++++++++++++++++++
lib/sup/person.rb | 7 ++++++-
2 files changed, 52 insertions(+), 1 deletions(-)
create mode 100644 lib/sup/cache.rb
diff --git a/lib/sup/cache.rb b/lib/sup/cache.rb
new file mode 100644
index 0000000..0836dbd
--- /dev/null
+++ b/lib/sup/cache.rb
@@ -0,0 +1,46 @@
+class Cache
+  def initialize n=128, i=3
+    @n = n
+    @i = i
+    @values = {}
+    @marks = {}
+    @delete_stack = []
+  end
+
+  def [](k)
+    if @values.member? k
+      @marks[k] = @i
+      @values[k]
+    else
+      nil
+    end
+  end
+
+  def []=(k,v)
+    if @values.size < @n
+      @values[k] = v
+      @marks[k] = @i
+    else
+      if @delete_stack.empty?
+        sweep
+      else
+        k2 = @delete_stack.pop
+        @values.delete k2
+        @marks.delete k2
+        self[k] = v
+      end
+    end
+  end
+
+  def sweep
+    @marks.each do |k,v|
+      v -= 1
+      if v == 0
+        @delete_stack.push k
+        @marks.delete k
+      else
+        @marks[k] = v
+      end
+    end
+  end
+end
diff --git a/lib/sup/person.rb b/lib/sup/person.rb
index c4f40a5..046eedc 100644
--- a/lib/sup/person.rb
+++ b/lib/sup/person.rb
@@ -1,3 +1,5 @@
+require 'sup/cache'
+
module Redwood
class Person
@@ -73,8 +75,11 @@ class Person
end.downcase
end
+ ## This method is expensive, so memoize it.
+ @from_address_cache = Cache.new
def self.from_address s
return nil if s.nil?
+ @from_address_cache[s].tap { |x| return x if x }
## try and parse an email address and name
name, email = case s
@@ -102,7 +107,7 @@ class Person
[nil, s]
end
- Person.new name, email
+ @from_address_cache[s] = Person.new name, email
end
def self.from_address_list ss
--
1.6.4
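
For readers following the patch, the memoization pattern it applies to Person.from_address can be sketched in isolation. This is a minimal sketch under stated assumptions, not the code from the patch: `slow_parse`, `PARSE_CACHE`, and `cached_parse` are hypothetical names, the regex is a simplified stand-in for Sup's address parsing, and an unbounded Hash stands in for the bounded Cache class.

```ruby
# Hypothetical stand-in for the expensive regex work in Person.from_address.
# Returns [name, email]; the regex is deliberately simplified.
def slow_parse(s)
  if s =~ /\A\s*"?(.*?)"?\s*<([^>]+)>\s*\z/
    name = $1.strip
    [name.empty? ? nil : name, $2]
  else
    [nil, s.strip]
  end
end

# Assumption: an unbounded Hash; the patch uses a size-limited Cache instead.
PARSE_CACHE = {}

def cached_parse(s)
  return nil if s.nil?
  # key? (rather than a truthiness test) lets a cached nil or false count as a hit.
  return PARSE_CACHE[s] if PARSE_CACHE.key?(s)
  PARSE_CACHE[s] = slow_parse(s)
end
```

Repeated calls with the same string return the same cached object, so the regex work runs once per distinct address string. The patch's `return x if x` form is equivalent here as long as the cached value is never nil or false.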
* [sup-talk] [PATCH] cache results of Person.from_address
2009-08-17 6:39 [sup-talk] [PATCH] cache results of Person.from_address Rich Lane
@ 2009-08-17 13:07 ` Andrei Thorp
2009-08-22 14:10 ` William Morgan
2009-08-22 14:10 ` William Morgan
1 sibling, 1 reply; 5+ messages in thread
From: Andrei Thorp @ 2009-08-17 13:07 UTC
Just want to say thanks again for your seemingly unending amount of good
work to improve Sup.
--
Andrei Thorp, Developer: Xandros Corp. (http://www.xandros.com)
* [sup-talk] [PATCH] cache results of Person.from_address
2009-08-17 6:39 [sup-talk] [PATCH] cache results of Person.from_address Rich Lane
2009-08-17 13:07 ` Andrei Thorp
@ 2009-08-22 14:10 ` William Morgan
2009-08-22 18:28 ` Rich Lane
1 sibling, 1 reply; 5+ messages in thread
From: William Morgan @ 2009-08-22 14:10 UTC
This looks good. Two minor questions before I apply:
Reformatted excerpts from Rich Lane's message of 2009-08-16:
> The regexes in this function are very expensive, so caching improves
> performance significantly for queries and slightly for indexing.
When you say this affects query performance, is it just the contact-list
query, or is there some other mechanism by which this is slowing down
regular queries?
Also in this method:
> +  def []=(k,v)
> +    if @values.size < @n
> +      @values[k] = v
> +      @marks[k] = @i
> +    else
> +      if @delete_stack.empty?
> +        sweep
> +      else
> +        k2 = @delete_stack.pop
> +        @values.delete k2
> +        @marks.delete k2
> +        self[k] = v
> +      end
> +    end
> +  end
Wouldn't it be better to do this?
    else
      if @delete_stack.empty?
        sweep
      end
      unless @delete_stack.empty?
        k2 = @delete_stack.pop
        @values.delete k2
        @marks.delete k2
        self[k] = v
      end
So we check the delete stack even after calling sweep, and we allow for
the value to be nil.
Thanks!
--
William <wmorgan-sup at masanjin.net>
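
The two points raised in this review (re-checking the delete stack after a sweep, and allowing nil values) can be sketched as a standalone class. This is a hypothetical rework, not code from the thread and not what was applied: the name `SweepCache`, the `key?` and `size` helpers, and the argument checks are assumptions added for illustration. It also sweeps repeatedly until something is evictable, because with the default i=3 a freshly filled cache needs three sweeps before any mark reaches zero, so a single sweep would still leave the delete stack empty.

```ruby
# Hypothetical rework of the Cache class under review; not the applied code.
class SweepCache
  def initialize(n = 128, i = 3)
    raise ArgumentError, "n and i must be >= 1" if n < 1 || i < 1
    @n = n              # maximum number of entries
    @i = i              # mark assigned on every read or write
    @values = {}
    @marks = {}
    @delete_stack = []
  end

  def key?(k)
    @values.key?(k)
  end

  def size
    @values.size
  end

  def [](k)
    return nil unless @values.key?(k)
    @marks[k] = @i      # reading an entry refreshes its mark
    @values[k]
  end

  def []=(k, v)
    # Evict only when adding a new key to a full cache.
    evict_one if !@values.key?(k) && @values.size >= @n
    @marks[k] = @i
    @values[k] = v      # always stored, even when v is nil
  end

  private

  def evict_one
    loop do
      sweep while @delete_stack.empty?
      k2 = @delete_stack.pop
      # Skip stale entries: keys touched again since the sweep, or already gone.
      next if @marks.key?(k2) || !@values.key?(k2)
      @values.delete(k2)
      return
    end
  end

  def sweep
    # Iterate over a snapshot of the keys so the hash can be modified safely.
    @marks.keys.each do |k|
      @marks[k] -= 1
      next unless @marks[k] <= 0
      @marks.delete(k)
      @delete_stack.push(k)
    end
  end
end
```

In this sketch a store on a full cache always succeeds, and `key?` distinguishes a cached nil from a miss. Eviction order is only an approximation of least-recently-used, as in the original design.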
* [sup-talk] [PATCH] cache results of Person.from_address
2009-08-17 13:07 ` Andrei Thorp
@ 2009-08-22 14:10 ` William Morgan
0 siblings, 0 replies; 5+ messages in thread
From: William Morgan @ 2009-08-22 14:10 UTC
Reformatted excerpts from Andrei Thorp's message of 2009-08-17:
> Just want to say thanks again for your seemingly unending amount of
> good work to improve Sup.
Agreed!
--
William <wmorgan-sup at masanjin.net>
* [sup-talk] [PATCH] cache results of Person.from_address
2009-08-22 14:10 ` William Morgan
@ 2009-08-22 18:28 ` Rich Lane
0 siblings, 0 replies; 5+ messages in thread
From: Rich Lane @ 2009-08-22 18:28 UTC
Excerpts from William Morgan's message of Sat Aug 22 10:10:04 -0400 2009:
> This looks good. Two minor questions before I apply:
>
> Reformatted excerpts from Rich Lane's message of 2009-08-16:
> > The regexes in this function are very expensive, so caching improves
> > performance significantly for queries and slightly for indexing.
>
> When you say this affects query performance, is it just the contact-list
> query, or is there some other mechanism by which this is slowing down
> regular queries?
Actually, your question prompted me to wonder why we're calling
Person.from_address on this path at all. With a little support from
Message we can completely avoid Message#parse_header. I've just sent in
a patch that does this. Please apply that rather than the from_address
cache.
The performance improvement from the new patch is slightly better than
that of the cache. Depending on the benchmark I see the time taken by
ThreadIndexMode#load_n_threads decrease by 1/2 to 2/3.