2020-06-27
17:25 <hashar> Disabled beta cluster update job (gerrit maintenance) https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/ [production]
17:19 <qchris> Stopping gerrit on gerrit1001 for the Gerrit upgrade [production]
17:14 <qchris> Duplicating reviewdb changes so we get a cheap and quick rollback [production]
17:11 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:11 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:11 <qchris> Disabling puppet on gerrit1001 for Gerrit upgrades + data migrations [production]
17:11 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:11 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:07 <qchris> Starting Gerrit upgrade to v3.2.2-98-g98d827eaa3 [production]
15:44 <qchris@deploy1001> Finished deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1002 (gerrit-test) (duration: 00m 08s) [production]
15:44 <qchris@deploy1001> Started deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1002 (gerrit-test) [production]
13:03 <qchris@deploy1001> Finished deploy [gerrit/gerrit@460e439]: Gerrit to v3.2.2-97-gcaf5020db1 on gerrit1002 (gerrit-test) (duration: 00m 08s) [production]
13:03 <qchris@deploy1001> Started deploy [gerrit/gerrit@460e439]: Gerrit to v3.2.2-97-gcaf5020db1 on gerrit1002 (gerrit-test) [production]
2020-06-26
18:42 <robh> all ulsfo onsite work completed as of 30 minutes ago [production]
17:52 <robh> msw2-ulsfo work done, all mgmt items confirmed back online and icinga alerts cleared, moving onto msw1-ulsfo (rack 22) and will lose all mgmt in that rack for next 10-20 minutes T256300 [production]
17:11 <robh> msw work in ulsfo via T256300 [production]
10:24 <ema> pool 5006 T256449 [production]
10:22 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1085', diff saved to https://phabricator.wikimedia.org/P11677 and previous config saved to /var/cache/conftool/dbconfig/20200626-102248-marostegui.json [production]
10:22 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1093', diff saved to https://phabricator.wikimedia.org/P11676 and previous config saved to /var/cache/conftool/dbconfig/20200626-102201-marostegui.json [production]
10:03 <ema> cp2039: restart purged T256444 [production]
09:57 <ema> cp2037: restart purged T256444 [production]
09:55 <ema> cp1087: restart purged T256444 [production]
09:46 <ema> cp2033: restart purged T256444 [production]
09:38 <akosiaris> move the sessionstore eqiad pods back to the dedicated sessionstore nodes [production]
09:37 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
09:35 <akosiaris> move the sessionstore codfw pods back to the dedicated sessionstore nodes [production]
09:35 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
09:08 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1093 for schema change', diff saved to https://phabricator.wikimedia.org/P11675 and previous config saved to /var/cache/conftool/dbconfig/20200626-090813-marostegui.json [production]
08:58 <jynus@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
08:56 <jynus@cumin1001> START - Cookbook sre.hosts.downtime [production]
08:33 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1088', diff saved to https://phabricator.wikimedia.org/P11674 and previous config saved to /var/cache/conftool/dbconfig/20200626-083319-marostegui.json [production]
08:25 <ayounsi@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:22 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1088 for schema change', diff saved to https://phabricator.wikimedia.org/P11673 and previous config saved to /var/cache/conftool/dbconfig/20200626-082242-marostegui.json [production]
08:20 <ayounsi@cumin1001> START - Cookbook sre.dns.netbox [production]
08:20 <ayounsi@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
08:05 <akosiaris@cumin1001> conftool action : set/pooled=yes; selector: name=kubernetes.*.wmnet [production]
08:04 <akosiaris@cumin1001> conftool action : set/weight=10; selector: name=kubernetes.*.wmnet [production]
08:04 <akosiaris> pool all new kubernetes nodes in LVS T252185 T256236 [production]
07:57 <ayounsi@cumin1001> START - Cookbook sre.dns.netbox [production]
07:44 <volans> force rebooted cp5006 that is unresponsive (after having depooled it) - T256449 [production]
07:42 <volans@cumin1001> conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet [production]
06:40 <tstarling@deploy1001> Synchronized wmf-config/InitialiseSettings.php: add cache-cookies log channel (duration: 00m 59s) [production]
05:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2088:3312, db2104', diff saved to https://phabricator.wikimedia.org/P11672 and previous config saved to /var/cache/conftool/dbconfig/20200626-051328-marostegui.json [production]
05:06 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
05:03 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
04:01 <cdanis> re-enable puppet on cps [production]
03:54 <cdanis> cdanis@cumin1001.eqiad.wmnet: sudo cumin A:cp 'disable-puppet "I39e1c68a is broken"' [production]
03:54 <cdanis> https://gerrit.wikimedia.org/r/c/operations/puppet/+/607917 [production]
02:52 <tstarling@deploy1001> Synchronized private/PrivateSettings.php: updating wgAuthenticationTokenVersion per my wikitech-l post (duration: 00m 57s) [production]