[Sks-devel] Emergency Maintenance: sks.mirror.square-r00t.net
brent s.
2017-12-10 16:41:13 UTC
Hey all-

I will shortly (within the next 20 minutes) be bringing
sks.mirror.square-r00t.net down for maintenance - expansion of storage
space and increased RAM.

I do not expect the outage to last more than an hour at a conservative
guess; a more realistic estimate is ~15-20 minutes.
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
brent s.
2017-12-10 17:03:59 UTC
Maintenance period has concluded successfully.
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
brent s.
2017-12-10 17:07:15 UTC
spoke too soon.

Dec 10 17:05:04 mirror.square-r00t.net systemd[1]: Started Synchronizing key server db instance.
Dec 10 17:05:07 mirror.square-r00t.net sks[2798]: Fatal error: exception Keydb.Unsafe.No_db
Dec 10 17:05:07 mirror.square-r00t.net systemd[1]: sks-db.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 10 17:05:07 mirror.square-r00t.net systemd[1]: sks-db.service: Failed with result 'exit-code'.

has anyone seen that "Fatal error: exception Keydb.Unsafe.No_db" before
(esp. after growing a filesystem)? I don't seem to have any other
inconsistent data on the filesystem from what i can tell
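
For anyone checking their own journal for the same failure, here is a minimal Python sketch that pulls the exception name out of syslog-style lines like the ones above. It assumes each journal entry sits on a single line; the regex and the function name `find_fatal_exceptions` are illustrative only and not part of SKS.

```python
import re

# Matches syslog-style lines such as:
#   Dec 10 17:05:07 host sks[2798]: Fatal error: exception Keydb.Unsafe.No_db
FATAL_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\[\s:]+)\[(?P<pid>\d+)\]: "
    r"Fatal error: exception (?P<exc>\S+)"
)


def find_fatal_exceptions(lines):
    """Return (timestamp, process, exception) for each fatal-error line."""
    hits = []
    for line in lines:
        m = FATAL_RE.match(line.strip())
        if m:
            hits.append((m.group("ts"), m.group("proc"), m.group("exc")))
    return hits


sample = [
    "Dec 10 17:05:04 mirror.square-r00t.net systemd[1]: "
    "Started Synchronizing key server db instance.",
    "Dec 10 17:05:07 mirror.square-r00t.net sks[2798]: "
    "Fatal error: exception Keydb.Unsafe.No_db",
]
print(find_fatal_exceptions(sample))
```

Only the `sks[2798]` line is reported; the systemd status lines are ignored because they carry no "Fatal error: exception" text.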
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
brent s.
2017-12-10 17:50:20 UTC
Decided to just rebuild the DB from scratch. Will be down for a bit.
Sorry, peers et al.!
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
brent s.
2017-12-10 22:09:33 UTC
services have been restored and from a fresh upstream dump, at that.
recon's running. thanks all.
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
Kristian Fiskerstrand
2017-12-10 22:15:09 UTC
Good that things are restored, but to try to debug this more generally,
can you confirm you used fastbuild rather than a full build originally?
In that case the offsets referenced could have changed during this
process, and the behavior would be within what is expected.
--
----------------------------
Kristian Fiskerstrand
Blog: https://blog.sumptuouscapital.com
Twitter: @krifisk
----------------------------
Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
----------------------------
"History doesn't repeat itself, but it does rhyme."
(Mark Twain)
brent s.
2017-12-10 22:20:24 UTC
Post by Kristian Fiskerstrand
Good that things are restored, but to try to debug this more generally,
can you confirm you used fastbuild rather than a full build originally?
full build has always been used, both in the original turnup and in this
new turnup.
Post by Kristian Fiskerstrand
In that case the offsets referenced could have changed during this
process, and the behavior would be within what is expected.
N/A
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info
Kristian Fiskerstrand
2017-12-10 22:26:41 UTC
In that case I'm surprised the expansion of disk store didn't work,
which isn't a big problem for the general keyserver in the global
gossiping network, but it _could_ cause issues for stand-alone servers.
So definitely not good.
--
----------------------------
Kristian Fiskerstrand
Blog: https://blog.sumptuouscapital.com
Twitter: @krifisk
----------------------------
Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
----------------------------
"Success is getting what you want. Happiness is wanting what you get"
(Dale Carnegie)
brent s.
2017-12-10 22:34:52 UTC
well, the disk expansion itself DID work. ;) it was low-level: just the
partition and filesystem size, on ext4 (i.e. a non-pooling filesystem).

yeah... even odder is that SKS seemed to be the only thing affected.

so, i'm puzzled, but i'm willing to chalk it up to a fluke. most likely
the service just didn't shut down cleanly before the partition
resizing.
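
One way to rule out an unclean shutdown before a future resize is to confirm the SKS units are fully stopped first. The following is a rough Python sketch, not a tested procedure. `sks-db.service` is taken from the journal output earlier in the thread; `sks-recon.service` is an assumed name for the recon unit and may differ locally. The helper names are hypothetical. It relies on `systemctl is-active`, which prints the unit's state.

```python
import subprocess

# sks-db.service appears in the journal output earlier in the thread;
# sks-recon.service is an ASSUMED name for the recon unit.
UNITS = ["sks-db.service", "sks-recon.service"]


def unit_state(unit):
    """Return the state string printed by `systemctl is-active <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def units_still_running(units, get_state=unit_state):
    """List units whose state is anything other than 'inactive'.

    States such as 'active', 'deactivating', or 'failed' are all
    treated as "not cleanly stopped".
    """
    return [u for u in units if get_state(u) != "inactive"]
```

Calling `units_still_running(UNITS)` just before the resize should return an empty list; any unit it names has not stopped cleanly. The `get_state` parameter exists only so the check can be exercised without systemd.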
--
brent saner
https://square-r00t.net/
GPG info: https://square-r00t.net/gpg-info