[Sks-devel] PTree may be corrupted kills recon service
André Keller
2018-07-11 15:38:57 UTC
Hi,

For a few days I have had an issue with the recon process on
keys.communityrack.org:

2018-07-02 15:17:53 Raising Sys.Break -- PTree may be corrupted:
Failure("remove_from_node: attempt to delete non-existant element from
prefix tree")
2018-07-02 15:17:53 DB closed

2018-07-10 11:06:09 Raising Sys.Break -- PTree may be corrupted:
Failure("remove_from_node: attempt to delete non-existant element from
prefix tree")
2018-07-10 11:06:10 DB closed


Between these two occurrences, I rebuilt the PTree database using the
following command:

/usr/sbin/sks pbuild -cache 20 -ptree_cache 70

After rebuilding, the recon process started to catch up on keys, but after
a few hours it crashed with the same message. Is there a way to recover
from this? Or do I need to start from scratch with an up-to-date keydump?
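
To see how often the crash recurs, the recon log can be scanned for the message quoted above. This is only a minimal sketch: the default log path is an assumption (pass the real one as the first argument), and the function names are made up for illustration.

```shell
#!/bin/sh
# Count "PTree may be corrupted" events in an SKS recon log and show
# when the most recent one happened.
# ASSUMPTION: the default log path below; pass your own as $1.

count_ptree_errors() {
    # grep -c prints the number of matching lines (0 if none).
    # "|| true" hides grep's exit status 1 when nothing matches.
    grep -c 'PTree may be corrupted' "$1" || true
}

last_ptree_error() {
    # The log lines start with "YYYY-MM-DD HH:MM:SS", so the first two
    # space-separated fields of the last match form its timestamp.
    grep 'PTree may be corrupted' "$1" | tail -n 1 | cut -d' ' -f1,2
}

LOG="${1:-/var/log/sks/recon.log}"
if [ -r "$LOG" ]; then
    echo "occurrences: $(count_ptree_errors "$LOG")"
    echo "most recent: $(last_ptree_error "$LOG")"
fi
```

Comparing the latest timestamp with the time of the last rebuild shows whether the corruption came back afterwards.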


Regards

André
John Zaitseff
2018-07-11 18:40:30 UTC
Hi,
Post by André Keller
for a few days I have an issue with the recon process on
Failure("remove_from_node: attempt to delete non-existant element
from prefix tree")
2018-07-02 15:17:53 DB closed
I saw the same thing happen. I stopped SKS, dumped my existing keys
to the dump directory ("/usr/sbin/sks dump 32768 /var/lib/sks/dump"),
tweaked the /etc/sks/sksconf file to include "pagesize: 32" and
"ptree_pagesize: 16", removed the DB and PTree directories, then
rebuilt both:

/usr/sbin/sks build /var/lib/sks/dump/*.pgp -n 1 -cache 100
/usr/sbin/sks cleandb
/usr/sbin/sks pbuild -cache 50 -ptree_cache 100

SKS restarted fine; so far so good! I'll be keeping an eye on it
over the next few days, so I'll report back as needed.
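
The sequence above can be collected into a small script. This is a hedged sketch, not a tested recovery tool: the sks subcommands and arguments are copied from the post, while the `run` wrapper and the `DRY_RUN`, `SKS` and `BASE` variables are illustrative names. Stopping and restarting SKS, editing /etc/sks/sksconf, and choosing the user and working directory are left to the operator, since the post does not spell them out.

```shell
#!/bin/sh
# Dump, wipe and rebuild the SKS databases, following the post above.
# By default the commands are only printed (DRY_RUN=1).
# ASSUMPTIONS: SKS is already stopped, and "pagesize: 32" and
# "ptree_pagesize: 16" have already been added to /etc/sks/sksconf.

DRY_RUN="${DRY_RUN:-1}"
SKS="${SKS:-/usr/sbin/sks}"
BASE="${BASE:-/var/lib/sks}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_sks() {
    # 1. Dump the existing keys, 32768 keys per file.
    run "$SKS" dump 32768 "$BASE/dump" || return 1
    # 2. Remove the old key database and prefix tree.
    run rm -rf "$BASE/DB" "$BASE/PTree" || return 1
    # 3. Rebuild the key database from the dump files.
    #    The glob is expanded here, after the dump step has run.
    run "$SKS" build "$BASE"/dump/*.pgp -n 1 -cache 100 || return 1
    # 4. Clean the database, then rebuild the prefix tree.
    run "$SKS" cleandb || return 1
    run "$SKS" pbuild -cache 50 -ptree_cache 100
}

rebuild_sks
```

Run as is, the script only prints the five commands; setting DRY_RUN=0 executes them, including the `rm -rf`, so it is worth reading the printed plan first.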

I'm just wondering whether someone has found yet another way to take
down SKS servers worldwide. It's a bit disappointing that the SKS
keyserver source code available on bitbucket.org has not been
touched in over a year... is anyone actually working on it?

Yours truly,

John Zaitseff
--
John Zaitseff ,--_|\ The ZAP Group
Phone: +61 2 9643 7737 / \ Sydney, Australia
E-mail: ***@zap.org.au \_,--._* http://www.zap.org.au/
v
André Keller
2018-07-12 15:55:20 UTC
Hi John,
Thank you for your reply. I have done that as well, and it has been
running stably for a few hours now. Let's see for how long :-)

Regards
André
Michael Jones
2018-07-13 13:09:45 UTC
Hi,

I was away for work and came back to find that one of the nodes I host had
run out of disk space.

Users of the service would not have been affected, as this node doesn't
serve web traffic other than the default key stats page.

And keys would have continued to sync via a backup node.

The node has been fixed and is back in service.

(this is for the service at sks.mj2.uk)

Kind Regards,
Mike
André Keller
2018-07-17 08:52:18 UTC
Hi all,
Unfortunately the issue is still not resolved. Is nobody else
experiencing this?
John Zaitseff
2018-07-17 08:57:23 UTC
Hi,
Post by André Keller
Unfortunately the issue is still not resolved. Is nobody else
experiencing this?
Mine's still working...

John
Keith Erekson
2018-07-17 14:44:53 UTC
After the last time I trashed the DB/PTree and rebuilt from a downloaded
dump, I copied the sample "DB_CONFIG" file from the Debian package into
the DB dir, and haven't had any problems since then.

Total disk space used by SKS is ~21GB.

(Sample config is /usr/share/doc/sks/sampleConfig/DB_CONFIG)
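
Copying the sample file could look roughly like this. It is a sketch under assumptions: the function name is illustrative, and /var/lib/sks/DB is assumed as the DB directory on a Debian-style install. An existing DB_CONFIG is left untouched.

```shell
#!/bin/sh
# Copy a sample DB_CONFIG into a Berkeley DB environment directory
# without overwriting a file that is already there.

install_db_config() {
    sample="$1"   # e.g. /usr/share/doc/sks/sampleConfig/DB_CONFIG
    dbdir="$2"    # e.g. /var/lib/sks/DB (ASSUMED location)
    if [ ! -f "$sample" ]; then
        echo "sample not found: $sample" >&2
        return 1
    fi
    if [ -e "$dbdir/DB_CONFIG" ]; then
        echo "kept existing $dbdir/DB_CONFIG"
        return 0
    fi
    cp "$sample" "$dbdir/DB_CONFIG" && echo "installed $dbdir/DB_CONFIG"
}

# Example call (paths as above):
#   install_db_config /usr/share/doc/sks/sampleConfig/DB_CONFIG /var/lib/sks/DB
```

Berkeley DB reads DB_CONFIG when the environment is opened, so it is safest to put the file in place while SKS is stopped.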

~Keith
_______________________________________________
Sks-devel mailing list
https://lists.nongnu.org/mailman/listinfo/sks-devel
John Zaitseff
2018-07-17 19:51:07 UTC
Hi, everyone,
Post by Keith Erekson
After the last time I trashed the DB/PTree and rebuilt from a
downloaded dump, I copied the sample "DB_CONFIG" file from the
Debian package into the DB dir, and haven't had any problems since
then.
Ah, yes, I forgot to mention that I had done this as well.
Actually, I just created a DB_CONFIG file in /var/lib/sks/DB with
one line, "set_flags DB_LOG_AUTOREMOVE". I put the same file in
/var/lib/sks/PTree as well, although I don't think it's needed
there.
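
Creating that one-line file can be scripted as below. It is a minimal sketch: the function name and the script name in the comment are hypothetical, and the directories are the ones named in the post. As far as I understand it, DB_LOG_AUTOREMOVE tells Berkeley DB to delete transaction log files it no longer needs, which keeps the log files from filling the disk.

```shell
#!/bin/sh
# Write a one-line DB_CONFIG into a Berkeley DB environment directory.
# Note: this overwrites any DB_CONFIG already present in that directory.

write_db_config() {
    # $1: an existing Berkeley DB environment directory
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    printf '%s\n' 'set_flags DB_LOG_AUTOREMOVE' > "$1/DB_CONFIG"
}

# Apply to every directory given on the command line, for example
# (script name is hypothetical):
#   sh write_db_config.sh /var/lib/sks/DB /var/lib/sks/PTree
for dir in "$@"; do
    write_db_config "$dir" && echo "wrote $dir/DB_CONFIG"
done
```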

Yours truly,

John Zaitseff
André Keller
2018-07-18 13:24:04 UTC
Hi all,
Post by John Zaitseff
Ah, yes, I forgot to mention that I had done this as well.
Actually, I just created a DB_CONFIG file in /var/lib/sks/DB with
one line, "set_flags DB_LOG_AUTOREMOVE". I put the same file in
/var/lib/sks/PTree as well, although I don't think it's needed
there.
Thank you both for the hint. I had actually forgotten to re-add this after
recreating the DB and PTree directories. Let's see if this helps.

Regards
André
