2019-06-20
|
10:10 |
<ema@cumin1001> |
START - Cookbook sre.hosts.upgrade-and-reboot |
[production] |
09:58 |
<_joe_> |
upgraded service-checker T225707 |
[production] |
09:56 |
<ema@cumin1001> |
END (FAIL) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=99) |
[production] |
09:51 |
<ema@cumin1001> |
END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) |
[production] |
09:50 |
<ema@cumin1001> |
START - Cookbook sre.hosts.upgrade-and-reboot |
[production] |
09:44 |
<ema@cumin1001> |
START - Cookbook sre.hosts.upgrade-and-reboot |
[production] |
09:30 |
<ema@cumin1001> |
END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) |
[production] |
09:25 |
<marostegui> |
Remove dbprov1001:/srv/backups/tmp/db1112 - T225981 |
[production] |
09:24 |
<ema@cumin1001> |
END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) |
[production] |
09:21 |
<ema@cumin1001> |
START - Cookbook sre.hosts.upgrade-and-reboot |
[production] |
09:17 |
<ema@cumin1001> |
START - Cookbook sre.hosts.upgrade-and-reboot |
[production] |
09:17 |
<ema> |
cache nodes: resume rolling reboots for kernel and varnish upgrades T224694 T225998 T226048 |
[production] |
08:39 |
<marostegui> |
Stop MySQL on db1124 (s1, s3, s5 and s8) to upgrade MySQL; this will generate lag on labs |
[production] |
07:59 |
<marostegui> |
Stop MySQL and reboot db2084 |
[production] |
07:15 |
<marostegui> |
Transfer dbprov1001:/srv/backups/tmp/db1112/sqldata to db1077 T225981 |
[production] |
07:00 |
<moritzm> |
installing intel-microcode updates to June 2019 release (microcode is unmodified for most CPUs except for Sandybridge/Core-X models) |
[production] |
06:43 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Depool and remove from config db1077 T225981 (duration: 00m 54s) |
[production] |
06:31 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to db1112 in s3 T225981 (duration: 00m 56s) |
[production] |
06:18 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
06:18 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
06:18 |
<moritzm> |
rebooting sarin for some tests with updated intel-microcode for MDS (also covering Sandybridge server CPUs initially not supported by Intel) |
[production] |
06:16 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to db1112 in s3 T225981 (duration: 00m 55s) |
[production] |
06:09 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to db1112 in s3 T225981 (duration: 00m 57s) |
[production] |
05:54 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to db1112 in s3 T225981 (duration: 00m 56s) |
[production] |
05:40 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to db1112 in s3 T225981 (duration: 00m 56s) |
[production] |
05:37 |
<marostegui> |
Deploy schema change on centralauth.oathauth_users T225643 |
[production] |
05:23 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Slowly pool db1112 into s3 T225981 (duration: 00m 55s) |
[production] |
05:22 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-codfw.php: Slowly pool db1112 into s3 T225981 (duration: 00m 55s) |
[production] |
05:04 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool db1077 T225981 (duration: 00m 55s) |
[production] |
04:53 |
<marostegui> |
Stop replication in sync on db1112 and db1077 to move db1124 under db1112 - T225981 |
[production] |
04:52 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Depool db1077 T225981 (duration: 00m 59s) |
[production] |
04:00 |
<onimisionipe> |
depooling maps1003 for reimage into new partition scheme - T224395 |
[production] |
2019-06-19
|
18:09 |
<legoktm> |
added MatmaRex to extension-VisualEditor-staff Gerrit group |
[production] |
16:50 |
<moritzm> |
running racreset on multatuli |
[production] |
16:50 |
<XioNoX> |
rollback redirect ns0 to authdns2001 |
[production] |
16:45 |
<moritzm> |
rebooting authdns1001 for kernel security update |
[production] |
16:42 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
16:42 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
16:39 |
<XioNoX> |
redirect ns0 to authdns2001 |
[production] |
16:37 |
<XioNoX> |
rollback redirect ns1 to authdns1001 |
[production] |
16:34 |
<moritzm> |
rebooting authdns2001 for kernel security update |
[production] |
16:28 |
<XioNoX> |
redirect ns1 to authdns1001 |
[production] |
16:26 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
16:26 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
16:23 |
<onimisionipe> |
pooling elastic1029 - T214283 |
[production] |
16:01 |
<ema> |
cache nodes: stop rolling reboots for today, 47/80 done T224694 T225998 |
[production] |
15:43 |
<reedy@deploy1001> |
rebuilt and synchronized wikiversions files: group0 back to .8 T226109 |
[production] |
15:43 |
<ema@cumin1001> |
END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) |
[production] |
15:40 |
<ema@cumin1001> |
END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) |
[production] |
15:37 |
<onimisionipe> |
pooled maps1002 - postgres init is complete and successfully joined to its cluster - T224395 |
[production] |