1 From michael+sup@stapelberg.de Mon Feb 6 15:45:56 2012
2 From: michael+sup@stapelberg.de (Michael Stapelberg)
3 Date: Mon, 06 Feb 2012 20:45:56 +0000
4 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
5 In-Reply-To: <1326155913-turnsole-58321@terminus-est>
6 References: <1326062741-sup-6469@stapelberg.de>
7 <1326155913-turnsole-58321@terminus-est>
8 Message-ID: <1328560984-sup-7904@stapelberg.de>
9
10 Hi William,
11
12 Excerpts from William Morgan's message of 2012-01-10 00:40:56 +0000:
13 > I need to write some code to do this. For IMAP and GMail, heliotrope-add will
14 > keep a pointer to the last message imported, by default. For mbox there is a
15 > trick you can use. But there's nothing for maildir right now. Please file
16 > an issue so that I don't forget this!
17 It's been four weeks, and https://github.com/wmorgan/heliotrope/issues/32 is
18 still open. Just to let you know: This is what's stopping me from using
19 heliotrope+turnsole and possibly participating in its development at the moment
20 :-/.
21
22 If it's a task which is too big for your current spare time, could you please
23 describe how you would implement it, so that others can help?
24
25 Best regards,
26 Michael
27
28 From matthieu.rakotojaona@gmail.com Mon Feb 6 18:08:33 2012
29 From: matthieu.rakotojaona@gmail.com (Matthieu Rakotojaona)
30 Date: Tue, 7 Feb 2012 00:08:33 +0100
31 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
32 In-Reply-To: <1328560984-sup-7904@stapelberg.de>
33 References: <1326062741-sup-6469@stapelberg.de>
34 <1326155913-turnsole-58321@terminus-est>
35 <1328560984-sup-7904@stapelberg.de>
36 Message-ID: <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
37
38 Hi,
39
40 On Mon, Feb 6, 2012 at 9:45 PM, Michael Stapelberg
41 <michael+sup at stapelberg.de> wrote:
42 > It's been four weeks, and https://github.com/wmorgan/heliotrope/issues/32 is
43 > still open. Just to let you know: This is what's stopping me from using
44 > heliotrope+turnsole and possibly participating in its development at the moment
45 > :-/.
46 >
47 > If it's a task which is too big for your current spare time, could you please
48 > describe how you would implement it, so that others can help?
49
50 As far as I can tell, IMAP will not be implemented in heliotrope. I
51 have started some work on an IMAP frontend, which you can fork:
52
53 https://github.com/rakoo/imaptrope
54
55 You use it as a separate process. It will create a RESTClient
56 connected to heliotrope, and you just connect to it with classical
57 IMAP clients.
58
59 --
60 Matthieu RAKOTOJAONA
61
62 From michael+sup@stapelberg.de Mon Feb 6 18:46:17 2012
63 From: michael+sup@stapelberg.de (Michael Stapelberg)
64 Date: Mon, 06 Feb 2012 23:46:17 +0000
65 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
66 In-Reply-To: <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
67 References: <1326062741-sup-6469@stapelberg.de>
68 <1326155913-turnsole-58321@terminus-est>
69 <1328560984-sup-7904@stapelberg.de>
70 <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
71 Message-ID: <1328571942-sup-6289@stapelberg.de>
72
73 Hi Matthieu,
74
75 Excerpts from Matthieu Rakotojaona's message of 2012-02-06 23:08:33 +0000:
76 > As far as I can tell, IMAP will not be implemented in heliotrope. I
77 > have started some work on an IMAP frontend, which you can fork:
78 Uhm, that's not what I am talking about. I want to fetch mails from an IMAP
79 server and store them in heliotrope.
80
81 Best regards,
82 Michael
83
84 From matthieu.rakotojaona@gmail.com Tue Feb 7 07:27:32 2012
85 From: matthieu.rakotojaona@gmail.com (Matthieu Rakotojaona)
86 Date: Tue, 7 Feb 2012 13:27:32 +0100
87 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
88 In-Reply-To: <1328571942-sup-6289@stapelberg.de>
89 References: <1326062741-sup-6469@stapelberg.de>
90 <1326155913-turnsole-58321@terminus-est>
91 <1328560984-sup-7904@stapelberg.de>
92 <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
93 <1328571942-sup-6289@stapelberg.de>
94 Message-ID: <CAMiZLn17XrTzJswryKM-TJqu6RabvLEXpRzWzo9jY5z68g--Ng@mail.gmail.com>
95
96 On Tue, Feb 7, 2012 at 12:46 AM, Michael Stapelberg
97 <michael+sup at stapelberg.de> wrote:
98 > Uhm, that's not what I am talking about. I want to fetch mails from an IMAP
99 > server and store them in heliotrope.
100
101 Yes, you can use offlineimap to sync 2 IMAP servers, although this is
102 still a work in progress. Or you can use 1-way copy tools, such as
103 imapcopy[0].
104
105 Oh, and heliotrope does tell you when it already has an email, based
106 on its Message-Id header.
107
108 [0] imapsync.lamiral.info/
109
110
111 --
112 Matthieu RAKOTOJAONA
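
The Message-Id check mentioned above can be pictured with a minimal Ruby
sketch. This is not Heliotrope's actual code: SeenStore and its add method
are hypothetical names, and the store lives in memory only to show the idea
of skipping a message whose Message-Id has already been recorded.

```ruby
require "set"

# Hypothetical in-memory store, for illustration only (not Heliotrope's API).
class SeenStore
  def initialize
    @seen_ids = Set.new
  end

  # Returns :added, :duplicate, or :no_message_id for a raw message string.
  def add(raw_message)
    id = extract_message_id(raw_message)
    return :no_message_id if id.nil?
    return :duplicate if @seen_ids.include?(id)

    @seen_ids << id
    :added
  end

  private

  # Looks for a Message-Id header in the header block (everything before the
  # first blank line). Folded headers are not handled in this sketch.
  def extract_message_id(raw_message)
    headers = raw_message.split(/\r?\n\r?\n/, 2).first.to_s
    match = headers.match(/^Message-ID:\s*(<[^>]+>)/i)
    match && match[1]
  end
end
```

Even with such a check, every message still has to be opened and parsed to
read its header, so deduplication by itself does not avoid a full scan.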
113
114 From michael+sup@stapelberg.de Tue Feb 7 14:11:00 2012
115 From: michael+sup@stapelberg.de (Michael Stapelberg)
116 Date: Tue, 07 Feb 2012 19:11:00 +0000
117 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
118 In-Reply-To: <CAMiZLn17XrTzJswryKM-TJqu6RabvLEXpRzWzo9jY5z68g--Ng@mail.gmail.com>
119 References: <1326062741-sup-6469@stapelberg.de>
120 <1326155913-turnsole-58321@terminus-est>
121 <1328560984-sup-7904@stapelberg.de>
122 <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
123 <1328571942-sup-6289@stapelberg.de>
124 <CAMiZLn17XrTzJswryKM-TJqu6RabvLEXpRzWzo9jY5z68g--Ng@mail.gmail.com>
125 Message-ID: <1328641741-sup-6676@stapelberg.de>
126
127 Hi Matthieu,
128
129 Excerpts from Matthieu Rakotojaona's message of 2012-02-07 12:27:32 +0000:
130 > Yes, you can use offlineimap to sync 2 IMAP servers, although this is
131 > still a work in progress. Or you can use 1-way copy tools, such as
132 > imapcopy[0].
133 I use offlineimap to sync my IMAP server at home with my notebook (maildir) and
134 that works fine.
135
136 > Oh, and heliotrope does tell you when it already has an email, based
137 > on its Message-Id header.
138 Yes, but heliotrope-add does a full scan on every run, which takes hours
139 (instead of seconds).
140
141 I explained this in my original email and William acknowledged that code for
142 this is just missing. I'm not sure what you are trying to say.
143
144 Best regards,
145 Michael
146
147 From matthieu.rakotojaona@gmail.com Tue Feb 7 16:54:08 2012
148 From: matthieu.rakotojaona@gmail.com (Matthieu Rakotojaona)
149 Date: Tue, 7 Feb 2012 22:54:08 +0100
150 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
151 In-Reply-To: <1328641741-sup-6676@stapelberg.de>
152 References: <1326062741-sup-6469@stapelberg.de>
153 <1326155913-turnsole-58321@terminus-est>
154 <1328560984-sup-7904@stapelberg.de>
155 <CAMiZLn1UjG=Mnn3WtPVZFNh59K6r=kAfR-9B_zz3H6oBwRVM4A@mail.gmail.com>
156 <1328571942-sup-6289@stapelberg.de>
157 <CAMiZLn17XrTzJswryKM-TJqu6RabvLEXpRzWzo9jY5z68g--Ng@mail.gmail.com>
158 <1328641741-sup-6676@stapelberg.de>
159 Message-ID: <CAMiZLn1au8xY9XHEdnBQLnTECfTxp+cshCWwfYm43abz7jhBTA@mail.gmail.com>
160
161 Hi Michael,
162
163 On Tue, Feb 7, 2012 at 8:11 PM, Michael Stapelberg
164 <michael+sup at stapelberg.de> wrote:
165 > I'm not sure what you are trying to say.
166
167 Ok, I think I know what you mean.
168
169 You currently have a main IMAP server. You regularly sync it to your
170 notebook in maildir with offlineimap. You would like to use this
171 maildir directly with heliotrope. Right?
172
173 So here are two options I can see for you:
174 * keep the maildir backup as a separate copy and sync your main IMAP
175 server directly with imaptrope (an IMAP interface to heliotrope). This
176 would need two runs of offlineimap.
177 * use offlineimap to sync your maildir to imaptrope. This might create
178 problems, because the two syncs might interfere with each other.
179
180
181
182
183 --
184 Matthieu RAKOTOJAONA
185
186 From wmorgan-sup@masanjin.net Mon Feb 13 01:16:33 2012
187 From: wmorgan-sup@masanjin.net (William Morgan)
188 Date: Sun, 12 Feb 2012 22:16:33 -0800
189 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
190 In-Reply-To: <1326062741-sup-6469@stapelberg.de>
191 References: <1326062741-sup-6469@stapelberg.de>
192 Message-ID: <1329113568-sup-8177@typhon>
193
194 Excerpts from Michael Stapelberg's message of 2012-01-08 14:48:18 -0800:
195 > What is the correct way to do this?
196
197 Please try the latest heliotrope master. You can do a full import with this:
198
199 ruby -Ilib bin/heliotrope-import -a <maildir> -d <mailstore> -t <state file>
200
201 where <state file> is just a place for it to dump its state. This import need
202 only happen once. Then for subsequent additions you can do this:
203
204 ruby -Ilib bin/heliotrope-add -a <maildir> -d <mailstore> -t <state file>
205
206 which will read from and write to that state file, and only add what it
207 hasn't yet seen.
208 --
209 William <wmorgan at masanjin.net>
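
The state-file idea described above can be sketched in a few lines of Ruby.
The thread does not spell out what heliotrope-add actually stores in its
state file, so the format here (a JSON list of already-imported file names)
and the helper name add_unseen are assumptions made purely for illustration.

```ruby
require "json"
require "set"

# Hypothetical state-file helper: remembers which file names were already
# imported and hands only the unseen ones to the caller's block.
def add_unseen(maildir_files, state_path)
  seen = File.exist?(state_path) ? Set.new(JSON.parse(File.read(state_path))) : Set.new
  fresh = maildir_files.reject { |name| seen.include?(name) }

  fresh.each { |name| yield name } if block_given? # e.g. index the message here

  # Persist the updated set so the next run skips everything handled so far.
  File.write(state_path, JSON.generate(seen.merge(fresh).to_a))
  fresh
end
```

A first call with an empty state file returns every file name; a later call
returns only the names that were not present the first time.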
210
211 From michael+sup@stapelberg.de Wed Feb 22 21:41:42 2012
212 From: michael+sup@stapelberg.de (Michael Stapelberg)
213 Date: Wed, 22 Feb 2012 22:41:42 +0100
214 Subject: [sup-devel] [Turnsole] Automatically displaying new mails?
215 Message-ID: <1329946808-sup-8894@stapelberg.de>
216
217 Hi,
218
219 I am trying to use Turnsole for my mails, but I noticed that it never seems to
220 pick up new mails until I hit refresh (@) manually. Is that intended or am I
221 doing it wrong?
222
223 It seems to me that leaving an HTTP connection open and having the server send
224 chunks on it when it receives new messages is a good way of doing this.
225
226 Best regards,
227 Michael
228
229 From michael+sup@stapelberg.de Wed Feb 22 21:39:42 2012
230 From: michael+sup@stapelberg.de (Michael Stapelberg)
231 Date: Wed, 22 Feb 2012 22:39:42 +0100
232 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
233 In-Reply-To: <1329113568-sup-8177@typhon>
234 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
235 Message-ID: <1329946381-sup-2255@stapelberg.de>
236
237 Hi William,
238
239 Thanks for adding code for this. I got the chance to test this today, and I'm
240 afraid this doesn't seem to be working as I would expect.
241
242 As far as I can tell from glancing at the source and the state file, this code
243 examines the date stored in every mail, sorts that, then checks where it
244 stopped before, and continues from there. I think this is a horrible way of
245 doing things for two reasons:
246
247 1) It involves opening every single email on disk.
248 2) Date specifications in emails are not to be trusted (think spam).
249
250 For the time being, I let procmail save my messages in my IMAP server and in
251 heliotrope with a little script, which I include here for others to see:
252
253 #!/bin/bash
254 # Delivers one mail (read from stdin) via dovecot and heliotrope
255
256 # load RVM
257 . /home/michael/.rvm/scripts/rvm
258
259 TMPFILE=$(mktemp)
260 # buffer stdin in a temp file so the same mail can be fed to both programs
261 cat > "$TMPFILE"
262
263 sudo -u mail -- /usr/lib/dovecot/deliver -d michael < "$TMPFILE"
264 ruby -I/home/michael/heliotrope/lib /home/michael/heliotrope/bin/heliotrope-add < "$TMPFILE" >>/tmp/last-mail.stdout 2>>/tmp/last-mail.stderr
265 rm -- "$TMPFILE"
266
267 After ensuring that it works, you should get rid of the stdout/stderr redirects
268 or redirect them properly to a place with logfile rotation, of course.
269
270 Note that adding mails via stdin doesn't work until you change the code like this:
271 https://github.com/wmorgan/heliotrope/pull/34
272
273 Best regards,
274 Michael
275
276 From michael+sup@stapelberg.de Wed Feb 22 21:57:35 2012
277 From: michael+sup@stapelberg.de (Michael Stapelberg)
278 Date: Wed, 22 Feb 2012 22:57:35 +0100
279 Subject: [sup-devel] [Heliotrope] Equivalent of archived sources
280 Message-ID: <1329947811-sup-7920@stapelberg.de>
281
282 Hi,
283
284 how would I get the same behavior as in sup when configuring a source to be
285 archived by default (a mailing list which I receive for reference, not for
286 actually reading it all the time)?
287
288 Best regards,
289 Michael
290
291 From michael+sup@stapelberg.de Wed Feb 22 22:00:27 2012
292 From: michael+sup@stapelberg.de (Michael Stapelberg)
293 Date: Wed, 22 Feb 2012 23:00:27 +0100
294 Subject: [sup-devel] [Heliotrope] What exactly does reordering do? Is it
295 still necessary?
296 Message-ID: <1329947900-sup-5773@stapelberg.de>
297
298 Hi,
299
300 I just realized that I forgot to import the maildir in which I store my sent
301 messages. After adding them to heliotrope, do I have to reorder the index? What
302 exactly does it do and why is that necessary?
303
304 I'm a bit confused because commit
305 https://github.com/wmorgan/heliotrope/commit/f7bfda9dd83db1b9cd2a51ba2599da81fc1b87c1
306 talks about "reindex without --reorder".
307
308 Best regards,
309 Michael
310
311 PS: The real time it takes to reorder my index is about 1 hour and 30 minutes :(
312
313 From triumhiz@yandex.ru Thu Feb 23 05:19:36 2012
314 From: triumhiz@yandex.ru (Serge Z)
315 Date: Thu, 23 Feb 2012 09:19:36 +0400
316 Subject: [sup-devel] [Heliotrope] [Turnsole] ebuilds
317 Message-ID: <20120223051936.2123.72633@localhost>
318
319
320 Hi!
321
322 Does anyone have ebuilds for heliotrope, turnsole or any related libraries?
323 Tried to compose my own, but didn't manage to.
324
325 From wmorgan-sup@masanjin.net Thu Feb 23 06:34:29 2012
326 From: wmorgan-sup@masanjin.net (William Morgan)
327 Date: Wed, 22 Feb 2012 22:34:29 -0800
328 Subject: [sup-devel] [Heliotrope] Equivalent of archived sources
329 In-Reply-To: <1329947811-sup-7920@stapelberg.de>
330 References: <1329947811-sup-7920@stapelberg.de>
331 Message-ID: <1329978718-turnsole-71233@terminus-est>
332
333 Excerpts from Michael Stapelberg's message of 2012-02-22 13:57:35 -0800:
334 > how would I get the same behavior as in sup when configuring a source to be
335 > archived by default (a mailing list which I receive for reference, not for
336 > actually reading it all the time)?
337
338 I think the best way to do this would be to add some code to
339 heliotrope-import that would allow overriding of labels on the
340 command line. Maybe something like --add-labels and --remove-labels
341 options, that would modify whatever labels the source provides.
342 Feel free to make an issue to track this.
343
344 --
345 William <wmorgan at masanjin.net>
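
A rough Ruby sketch of the proposal above follows. The --add-labels and
--remove-labels options are only suggested in this thread and are not
existing heliotrope-import flags; the helper names are hypothetical, and
treating "archived" as "no inbox label" is an assumption carried over from
how sup behaves.

```ruby
require "optparse"
require "set"

# Parses the proposed (not yet existing) label options from an argv array.
def parse_label_options(argv)
  opts = { add: [], remove: [] }
  OptionParser.new do |o|
    o.on("--add-labels LABELS", Array, "comma-separated labels to add") { |v| opts[:add] = v }
    o.on("--remove-labels LABELS", Array, "comma-separated labels to remove") { |v| opts[:remove] = v }
  end.parse(argv)
  opts
end

# Applies the overrides to whatever labels the source provided for a message.
def apply_label_overrides(source_labels, opts)
  (Set.new(source_labels) | opts[:add]) - opts[:remove]
end
```

With "--add-labels list,reference --remove-labels inbox", a message that
the source labelled inbox and unread would end up with list, reference and
unread, which approximates an archived-by-default source.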
346
347 From wmorgan-sup@masanjin.net Thu Feb 23 06:39:47 2012
348 From: wmorgan-sup@masanjin.net (William Morgan)
349 Date: Wed, 22 Feb 2012 22:39:47 -0800
350 Subject: [sup-devel] [Heliotrope] What exactly does reordering do? Is it
351 still necessary?
352 In-Reply-To: <1329947900-sup-5773@stapelberg.de>
353 References: <1329947900-sup-5773@stapelberg.de>
354 Message-ID: <1329978886-turnsole-20550@terminus-est>
355
356 Excerpts from Michael Stapelberg's message of 2012-02-22 14:00:27 -0800:
357 > I just realized that I forgot to import the maildir in which I store
358 > my sent messages. After adding them to heliotrope, do I have to
359 > reorder the index?
360
361 Probably yes.
362
363 > What exactly does it do and why is that necessary?
364
365 Heliotrope only ever returns messages in the order in which they were
366 indexed. So if you import mail from different sources, one after the
367 other, unless those sources happen to be in chronological order, you
368 will have to reorder.
369
370 It's a little hard to see that this is happening because Turnsole
371 reorders the items it displays (for complicated reasons), but you'll
372 probably start to notice it as you paginate.
373
374 > I'm a bit confused because commit
375 > https://github.com/wmorgan/heliotrope/commit/f7bfda9dd83db1b9cd2a51ba2599da81fc1b87c1
376 > talks about "reindex without --reorder".
377
378 If the index ever becomes corrupted, it's nice to be able to rebuild it.
379 That's the only real reason to do that. (Or maybe the index format
380 changes in later releases, etc.)
381
382 > PS: The real time it takes to reorder my index is about 1 hour and 30
383 > minutes :(
384
385 Sorry. But the point of indexing is to frontload all the work, so that
386 import time (which only has to happen once) is slow, but search time is
387 fast.
388
389 --
390 William <wmorgan at masanjin.net>
391
392 From wmorgan-sup@masanjin.net Thu Feb 23 06:26:44 2012
393 From: wmorgan-sup@masanjin.net (William Morgan)
394 Date: Wed, 22 Feb 2012 22:26:44 -0800
395 Subject: [sup-devel] [Turnsole] Automatically displaying new mails?
396 In-Reply-To: <1329946808-sup-8894@stapelberg.de>
397 References: <1329946808-sup-8894@stapelberg.de>
398 Message-ID: <1329978326-turnsole-76268@terminus-est>
399
400 Excerpts from Michael Stapelberg's message of 2012-02-22 13:41:42 -0800:
401 > I am trying to use Turnsole for my mails, but I noticed that it never
402 > seems to pick up new mails until I hit refresh (@) manually. Is that
403 > intended or am I doing it wrong?
404
405 You are correct. I haven't added this yet. Manual refresh is your only
406 option. It shouldn't be too hard to add a simple polling mechanism.
407 Feel free to make an issue for this.
408
409 --
410 William <wmorgan at masanjin.net>
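
The "simple polling mechanism" mentioned above might look roughly like the
Ruby sketch below. It makes no assumption about Heliotrope's HTTP API: the
fetch_ids argument stands in for whatever call would return the current
thread ids, and poll_for_new is a hypothetical name.

```ruby
# Calls fetch_ids (any callable returning an array of ids) `rounds` times,
# sleeping `interval` seconds between calls, and yields the ids that were
# not present in any earlier round.
def poll_for_new(fetch_ids, interval:, rounds:)
  seen = {}
  rounds.times do |i|
    fresh = fetch_ids.call.reject { |id| seen.key?(id) }
    fresh.each { |id| seen[id] = true }
    yield fresh unless fresh.empty?
    sleep interval if i < rounds - 1
  end
end
```

Polling trades latency for simplicity: new mail shows up at most one
interval late, whereas the long-lived HTTP connection suggested earlier in
the thread would push it immediately at the cost of more server-side code.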
411
412 From wmorgan-sup@masanjin.net Thu Feb 23 06:25:11 2012
413 From: wmorgan-sup@masanjin.net (William Morgan)
414 Date: Wed, 22 Feb 2012 22:25:11 -0800
415 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
416 In-Reply-To: <1329946381-sup-2255@stapelberg.de>
417 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
418 <1329946381-sup-2255@stapelberg.de>
419 Message-ID: <1329977859-turnsole-79917@terminus-est>
420
421 Excerpts from Michael Stapelberg's message of 2012-02-22 13:39:42 -0800:
422 > Thanks for adding code for this. I got the chance to test this today,
423 > and I'm afraid this doesn't seem to be working as I would expect.
424 >
425 > As far as I can tell from glancing at the source and the state file, this code
426 > examines the date stored in every mail, sorts that, then checks where it
427 > stopped before, and continues from there. I think this is a horrible way of
428 > doing things for two reasons:
429 >
430 > 1) It involves opening every single email on disk.
431 > 2) Date specifications in emails are not to be trusted (think spam).
432
433 My understanding of the Maildir specification is that there is no way to
434 determine the order of messages besides reading the date headers.
435 Ordering messages correctly at import time is important because
436 Heliotrope only serves messages in the reverse of the order in which they were
437 imported. If there is a better solution, please feel free to educate me.
438
439 Is there any problem with the current code besides the fact that you
440 don't like the big scan?
441
442 > Note that adding mails via stdin doesn't work until you change the code like this:
443 > https://github.com/wmorgan/heliotrope/pull/34
444
445 Merged, thank you.
446
447 --
448 William <wmorgan at masanjin.net>
449
450 From wirtwolff@gmail.com Thu Feb 23 06:38:27 2012
451 From: wirtwolff@gmail.com (Wirt Wolff)
452 Date: Wed, 22 Feb 2012 23:38:27 -0700
453 Subject: [sup-devel] [Heliotrope] [Turnsole] ebuilds
454 In-Reply-To: <20120223051936.2123.72633@localhost>
455 References: <20120223051936.2123.72633@localhost>
456 Message-ID: <1329978847-sup-4921@chigamba>
457
458 Excerpts from Serge Z's message of Wed Feb 22 22:19:36 -0700 2012:
459 >
460 > Does anyone have ebuilds for heliotrope, turnsole or any related libraries?
461 > Tried to compose my own, but didn't manage to.
462
463 I found ruby support on gentoo to be a bit of a nightmare. I would use
464 rvm [1] to install instead, as user rather than globally if possible.
465
466 [1] https://rvm.beginrescueend.com/
467
468 Regards,
469
470 W
471
472 From michael+sup@stapelberg.de Thu Feb 23 11:24:22 2012
473 From: michael+sup@stapelberg.de (Michael Stapelberg)
474 Date: Thu, 23 Feb 2012 12:24:22 +0100
475 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
476 In-Reply-To: <1329977859-turnsole-79917@terminus-est>
477 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
478 <1329946381-sup-2255@stapelberg.de>
479 <1329977859-turnsole-79917@terminus-est>
480 Message-ID: <1329996040-sup-2419@stapelberg.de>
481
482 Hi William,
483
484 Excerpts from William Morgan's message of 2012-02-23 07:25:11 +0100:
485 > My understanding of the Maildir specification is that there is no way to
486 > determine the order of messages besides reading the date headers.
487 > Ordering messages correctly at import time is important because
488 > Heliotrope only serves messages in the reverse of the order in which they were
489 > imported. If there is a better solution, please feel free to educate me.
490 Well, I guess you are correct. So, when somebody sends me a spam email or a
491 malicious email with a faked date, what happens? I think that the code will
492 figure out it needs to re-add a lot of messages. Also, my index will need to be
493 reordered, right?
494
495 Also, what happens when somebody sends me a message with a faked date and I add
496 it using heliotrope-add? Do I need to reorder my index?
497
498 > Is there any problem with the current code besides the fact that you
499 > don't like the big scan?
500 That makes it sound like it's just a matter of opinion. My concern is what I
501 stated above (huge processing load / disk IO caused by malicious messages) and
502 that it thrashes my poor disks.
503
504 Best regards,
505 Michael
506
507 From triumhiz@yandex.ru Fri Feb 24 05:19:07 2012
508 From: triumhiz@yandex.ru (Serge Z)
509 Date: Fri, 24 Feb 2012 09:19:07 +0400
510 Subject: [sup-devel] [Heliotrope] why whistlepig?
511 Message-ID: <20120224051907.5668.78469@localhost>
512
513
514 Hi,
515
516 subj.
517 Why not xapian?
518
519 Thanks
520
521 From triumhiz@yandex.ru Fri Feb 24 05:43:18 2012
522 From: triumhiz@yandex.ru (Serge Z)
523 Date: Fri, 24 Feb 2012 09:43:18 +0400
524 Subject: [sup-devel] [Heliotrope] search for cyrillic terms does not work
525 Message-ID: <20120224054318.8144.49786@localhost>
526
527
528 Hi!
529 I've submitted issue #36 at github.com on this problem.
530
531 > Search for Cyrillic terms works neither in the web interface nor in turnsole.
532 >
533 > The search term is present in utf-8 and windows-1251 encodings in different mails.
534 > My system encoding is utf-8.
535 >
536 > Ruby 1.9
537 >
538 > Please inform if this is not an issue with ruby 1.8 or any other environment
539
540 Can you please suggest why that could happen or offer a simple workaround?
541 This issue is the only one which stops me from migrating to Heliotrope/Turnsole.
542
543 I did not have this problem with sup.
544
545 From wmorgan-sup@masanjin.net Sat Feb 25 06:35:32 2012
546 From: wmorgan-sup@masanjin.net (William Morgan)
547 Date: Fri, 24 Feb 2012 22:35:32 -0800
548 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
549 In-Reply-To: <1329996040-sup-2419@stapelberg.de>
550 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
551 <1329946381-sup-2255@stapelberg.de>
552 <1329977859-turnsole-79917@terminus-est>
553 <1329996040-sup-2419@stapelberg.de>
554 Message-ID: <1330150790-turnsole-3005@terminus-est>
555
556 Excerpts from Michael Stapelberg's message of 2012-02-23 03:24:22 -0800:
557 > Well, I guess you are correct. So, when somebody sends me a spam email
558 > or a malicious email with a faked date, what happens? I think that the
559 > code will figure out it needs to re-add a lot of messages. Also, my
560 > index will need to be reordered, right?
561
562 You won't need to reorder. Importing is all incremental, at least if you
563 use --state-file: it will keep track of the last message imported, and
564 successive imports will only pick up new messages.
565
566 But you are making me realize that the current maildir implementation is
567 not right. I think the way to handle Maildir is to look only at the
568 ctime of the files and not the date headers. That will make incremental
569 importing possible (just save the filename of the last imported file,
570 and look for all files newer than that). It will speed things up
571 anyways.
572
573 And if you're in the funny situation where the ctime is not correlated
574 with the Date: headers in your files, which is perfectly possible by the
575 Maildir spec, then you will have to reorder after that initial import.
576 But in reality that's a possibility with the other source types
577 too---it's just more likely with Maildir.
578
579 --
580 William <wmorgan at masanjin.net>
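
The ctime approach proposed above could be sketched in Ruby as follows.
This is only an illustration of the idea, not the eventual Heliotrope
implementation: files_newer_than is a hypothetical helper, it scans a
single directory rather than a full Maildir with cur/new/tmp, and it falls
back to returning everything when the remembered file no longer exists.

```ruby
# Returns the regular files in `dir` whose ctime is later than that of
# `last_imported` (a path, or nil for a first run), oldest first.
def files_newer_than(dir, last_imported = nil)
  since = last_imported && File.exist?(last_imported) ? File.ctime(last_imported) : nil

  Dir.children(dir)
     .map { |name| File.join(dir, name) }
     .select { |path| File.file?(path) }
     .map { |path| [path, File.ctime(path)] }
     .select { |_path, ctime| since.nil? || ctime > since }
     .sort_by { |path, ctime| [ctime, path] }
     .map(&:first)
end
```

Only file metadata is read here, so no message has to be opened to decide
whether it is new. One limitation of the strict comparison is that a file
sharing the exact ctime of the remembered file would be skipped.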
581
582 From wmorgan-sup@masanjin.net Sat Feb 25 06:39:50 2012
583 From: wmorgan-sup@masanjin.net (William Morgan)
584 Date: Fri, 24 Feb 2012 22:39:50 -0800
585 Subject: [sup-devel] [Heliotrope] why whistlepig?
586 In-Reply-To: <20120224051907.5668.78469@localhost>
587 References: <20120224051907.5668.78469@localhost>
588 Message-ID: <1330151741-turnsole-57719@terminus-est>
589
590 Excerpts from Serge Z's message of 2012-02-23 21:19:07 -0800:
591 > Why not xapian?
592
593 At least as of a year ago when I looked at Xapian, the incremental
594 indexing was painful (you had to keep a separate smaller corpus of
595 additions, reindex it every time a new document was added, and merge
596 when it got too big), doing anything outside of indexing MySQL rows was
597 extremely painful (you had to use some weird XML format that I could
598 never get quite right), and there was no way to have mutable arbitrary
599 labels.
600
601 I also prefer to keep everything in a single process, but that's
602 not a hard requirement.
603
604 --
605 William <wmorgan at masanjin.net>
606
607 From wmorgan-sup@masanjin.net Sat Feb 25 06:42:57 2012
608 From: wmorgan-sup@masanjin.net (William Morgan)
609 Date: Fri, 24 Feb 2012 22:42:57 -0800
610 Subject: [sup-devel] [Heliotrope] search for cyrillic terms does not work
611 In-Reply-To: <20120224054318.8144.49786@localhost>
612 References: <20120224054318.8144.49786@localhost>
613 Message-ID: <1330152007-turnsole-69593@terminus-est>
614
615 Excerpts from Serge Z's message of 2012-02-23 21:43:18 -0800:
616 > Can you please suggest why that could happen or offer a simple
617 > workaround? This issue is the only one which stops me from migrating
618 > to Heliotrope/Turnsole.
619
620 I'll have to look into it. Super-ascii search was definitely working at
621 some point, so I'm not sure why there was a regression. You can help by
622 trying to see whether the problem is in whistlepig or in heliotrope,
623 e.g. by adding and querying Cyrillic documents in a whistlepig index of
624 your own construction.
625
626 --
627 William <wmorgan at masanjin.net>
628
629 From np@nicolaspouillard.fr Sun Feb 26 15:37:28 2012
630 From: np@nicolaspouillard.fr (Nicolas Pouillard)
631 Date: Sun, 26 Feb 2012 16:37:28 +0100
632 Subject: [sup-devel] [Heliotrope] Equivalent of archived sources
633 In-Reply-To: <1329978718-turnsole-71233@terminus-est>
634 References: <1329947811-sup-7920@stapelberg.de>
635 <1329978718-turnsole-71233@terminus-est>
636 Message-ID: <1330270255-turnsole-17000@ss>
637
638 Excerpts from William Morgan's message of 2012-02-23 07:34:29 +0100:
639 > Excerpts from Michael Stapelberg's message of 2012-02-22 13:57:35 -0800:
640 > > how would I get the same behavior as in sup when configuring a source to be
641 > > archived by default (a mailing list which I receive for reference, not for
642 > > actually reading it all the time)?
643 >
644 > I think the best way to do this would be to add some code to
645 > heliotrope-import that would allow overriding of labels on the
646 > command line. Maybe something like --add-labels and --remove-labels
647 > options, that would modify whatever labels the source provides.
648 > Feel free to make an issue to track this.
649
650 This would indeed be a really good idea. In particular to import folder
651 by folder and set a label at the same time.
652
653 (one of my first mails sent through turnsole)
654
655 From michael+sup@stapelberg.de Mon Feb 27 09:16:43 2012
656 From: michael+sup@stapelberg.de (Michael Stapelberg)
657 Date: Mon, 27 Feb 2012 10:16:43 +0100
658 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
659 In-Reply-To: <1330150790-turnsole-3005@terminus-est>
660 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
661 <1329946381-sup-2255@stapelberg.de>
662 <1329977859-turnsole-79917@terminus-est>
663 <1329996040-sup-2419@stapelberg.de>
664 <1330150790-turnsole-3005@terminus-est>
665 Message-ID: <1330333805-sup-1287@stapelberg.de>
666
667 Hi William,
668
669 Sorry for clinging to that topic, but it's important for me to properly
670 understand it.
671
672 Excerpts from William Morgan's message of 2012-02-25 07:35:32 +0100:
673 > Excerpts from Michael Stapelberg's message of 2012-02-23 03:24:22 -0800:
674 > > Well, I guess you are correct. So, when somebody sends me a spam email
675 > > or a malicious email with a faked date, what happens? I think that the
676 > > code will figure out it needs to re-add a lot of messages. Also, my
677 > > index will need to be reordered, right?
678 >
679 > You won't need to reorder. Importing is all incremental, at least if you
680 > use --state-file: it will keep track of the last message imported, and
681 > successive imports will only pick up new messages.
682 Apparently, I didn't properly describe the scenario I'm thinking about:
683
684 state-file contains an entry "/my/maildir/2012-02-22-foo-bar", which is an
685 email that contains 2012-02-22 as date.
686
687 Now, a spammer sends me an email with a faked date, let?s say 2001-01-01, let?s
688 call it "/my/maildir/2001-01-01-spam-mail". The next run of heliotrope-add has
689 two possibilities:
690
691 1) It will completely ignore it. I think this is what will happen, based on
692 your description. This is horrible! Emails are absolutely not guaranteed to
693 arrive in my system in the order they were sent. Is this the case? Do we
694 ignore email because of date problems like this currently?
695
696 2) It will pick up this email because the date inside is older. Subsequent runs
697 will discover all email from the last 10 years as new, and try to re-add it
698 to the index. Lots of unnecessary overhead.
699
700 > But you are making me realize that the current maildir implementation is
701 > not right. I think the way to handle Maildir is to look only at the
702 > ctime of the files and not the date headers. That will make incremental
703 > importing possible (just save the filename of the last imported file,
704 > and look for all files newer than that). It will speed things up
705 > anyways.
706 Right, that sounds good.
707
708 > And if you're in the funny situation where the ctime is not correlated
709 > with the Date: headers in your files, which is perfectly possible by the
710 > Maildir spec, then you will have to reorder after that initial import.
711 > But in reality that's a possibility with the other source types
712 > too---it's just more likely with Maildir.
713 So, let's stick to my above example of the spam email and let's assume that you
714 changed the code to use ctimes. The spam mail arrives, its ctime is new, it
715 gets picked up into my index. To my understanding, I now need to reorder. I
716 have multiple questions:
717
718 1) What are the immediate consequences? Where will this email appear in my
719 inbox, when my inbox contains 5 emails. Always at the top? Bottom? You
720 mentioned I can see the effect when paging. Can you elaborate please?
721
722 2) How do I know when I have to reorder?
723
724 Best regards,
725 Michael
726
727 From wmorgan-sup@masanjin.net Wed Feb 29 01:11:56 2012
728 From: wmorgan-sup@masanjin.net (William Morgan)
729 Date: Tue, 28 Feb 2012 17:11:56 -0800
730 Subject: [sup-devel] [Heliotrope/Turnsole] How to use IMAP?
731 In-Reply-To: <1330333805-sup-1287@stapelberg.de>
732 References: <1326062741-sup-6469@stapelberg.de> <1329113568-sup-8177@typhon>
733 <1329946381-sup-2255@stapelberg.de>
734 <1329977859-turnsole-79917@terminus-est>
735 <1329996040-sup-2419@stapelberg.de>
736 <1330150790-turnsole-3005@terminus-est>
737 <1330333805-sup-1287@stapelberg.de>
738 Message-ID: <1330468308-turnsole-43856@terminus-est>
739
740 Excerpts from Michael Stapelberg's message of 2012-02-27 01:16:43 -0800:
741 > 1) It will completely ignore it.
742
743 Correct. This is the problem that the ctime solution is meant to address.
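
The difference between the two pointers can be made concrete with a small
Ruby sketch of the scenario from the quoted message. The Mail record, its
field names, the helper names and the ctime values are all made up for
illustration; only the two paths and the two dates come from the example.

```ruby
# Illustrative record; the field names are assumptions for this sketch.
Mail = Struct.new(:path, :date_header, :ctime)

# Date-header pointer: only mails dated after the last imported one are new.
def new_by_date_header(mails, last_imported)
  mails.select { |m| m.date_header > last_imported.date_header }
end

# ctime pointer: only files that appeared on disk after the last imported
# one are new, whatever their Date header claims.
def new_by_ctime(mails, last_imported)
  mails.select { |m| m.ctime > last_imported.ctime }
end

LAST = Mail.new("/my/maildir/2012-02-22-foo-bar", Time.utc(2012, 2, 22), Time.utc(2012, 2, 22, 9))
SPAM = Mail.new("/my/maildir/2001-01-01-spam-mail", Time.utc(2001, 1, 1), Time.utc(2012, 2, 27, 8))
```

Calling new_by_date_header([LAST, SPAM], LAST) returns nothing, so the
back-dated mail is silently ignored, while new_by_ctime([LAST, SPAM], LAST)
returns the spam mail because it arrived on disk later.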
744
745 > So, let's stick to my above example of the spam email and let's assume
746 > that you changed the code to use ctimes. The spam mail arrives, its
747 > ctime is new, it gets picked up into my index. To my understanding, I
748 > now need to reorder. I have multiple questions:
749 >
750 > 1) What are the immediate consequences? Where will this email appear in my
751 > inbox, when my inbox contains 5 emails. Always at the top? Bottom? You
752 > mentioned I can see the effect when paging. Can you elaborate please?
753
754 Heliotrope will return it as the first result, since it returns messages
755 in LIFO order.
756
757 Currently Turnsole re-sorts all the messages it gets back by
758 date, and so it will place it at the bottom. This is also where the
759 pagination issues may come from. Maybe it would be less confusing for it
760 to stick to search result order.
761
762 > 2) How do I know when I have to reorder?
763
764 You have to reorder whenever you want Heliotrope to return messages in
765 Date header order instead of in LIFO order. I think this will typically
766 happen when you import a couple batches of preexisting mail, and not
767 when you're adding new mail.
768
769 --
770 William <wmorgan at masanjin.net>
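
The ordering behaviour described above can be illustrated with a toy model
in Ruby. It is a sketch under stated assumptions, not Heliotrope or
Turnsole code: the index is a plain array in order of indexing, the client
is assumed to sort each fetched page newest first, and all names and dates
except the back-dated spam example are invented.

```ruby
Message = Struct.new(:subject, :date)

# Order of indexing: an inbox import, then a sent-mail import, then a
# back-dated spam mail added last.
INDEX = [
  Message.new("inbox-1", Time.utc(2012, 2, 20)),
  Message.new("inbox-2", Time.utc(2012, 2, 22)),
  Message.new("sent-1",  Time.utc(2012, 2, 21)),
  Message.new("spam",    Time.utc(2001, 1, 1)),
].freeze

# What the server hands out: last indexed first (LIFO).
def lifo(index)
  index.reverse
end

# A client that fetches pages in LIFO order and sorts each page by date.
def paged_view(index, page_size)
  lifo(index).each_slice(page_size).flat_map { |page| page.sort_by(&:date).reverse }
end

# What a reorder aims for: the whole index in Date header order.
def reordered(index)
  index.sort_by(&:date).reverse
end
```

With a page size of 2, the paged view lists sent-1 and spam before
inbox-2 and inbox-1, although inbox-2 is the newest message. After a
reorder the sequence is inbox-2, sent-1, inbox-1, spam, which is why the
mismatch only becomes visible once results span more than one page.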
771
772 From triumhiz@yandex.ru Wed Feb 29 10:44:13 2012
773 From: triumhiz@yandex.ru (Serge Z)
774 Date: Wed, 29 Feb 2012 14:44:13 +0400
775 Subject: [sup-devel] [Heliotrope] search for cyrillic terms does not work
776 In-Reply-To: <1330152007-turnsole-69593@terminus-est>
777 References: <20120224054318.8144.49786@localhost>
778 <1330152007-turnsole-69593@terminus-est>
779 Message-ID: <20120229104413.1197.73225@localhost>
780
781
782 Quoting William Morgan (2012-02-25 10:42:57)
783 >I'll have to look into it. Super-ascii search was definitely working at
784 >some point, so I'm not sure why there was a regression. You can help by
785 >trying to see whether the problem is in whistlepig or in heliotrope,
786 >e.g. by adding and querying Cyrillic documents in a whistlepig index of
787 >your own construction.
788
789 Does whistlepig have any tool for managing its indexes manually?
790 Or do I have to code in either ruby or C to do that?
791