2021-04-08
18:03 <mutante> mw2403 through mw2411 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) [production]
18:02 <mutante> mw2403 through mw2411 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) [production]
17:59 <tgr@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . [production]
17:52 <ryankemper@cumin2001> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) [production]
17:29 <jgiannelos@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
17:23 <jgiannelos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
17:18 <jgiannelos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
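(The three 'proton' entries above record one service deploy rolled through staging, codfw and eqiad with helmfile. A minimal sketch of such a run from the deploy host, assuming WMF's deployment-charts layout; the path and environment names here are assumptions, while the -e flag and the diff/sync subcommands are standard helmfile:
    cd /srv/deployment-charts/helmfile.d/services/proton   # assumed chart location
    helmfile -e staging diff    # review the pending change
    helmfile -e staging sync    # apply; repeat with -e codfw and -e eqiad
)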
17:16 <dancy> Scap 3.17.0 deployed to beta cluster [production]
16:51 <dancy> testing Scap 3.17.0 release on deployment-deploy01 [production]
16:33 <elukey> reboot an-worker1100 again to check if all the disks come up correctly [production]
16:16 <cmjohnson1> update bios cp1087, already depooled for h/w issues T278729 [production]
16:15 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE [production]
16:13 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE [production]
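(The START/END pair above is emitted by the spicerack cookbook that downtimes a host in Icinga ahead of a reimage. A rough sketch of the invocation from a cumin host, assuming current option names, which may differ by cookbook version:
    sudo cookbook sre.hosts.downtime --hours 2 --reason "REIMAGE" wtp1025.eqiad.wmnet
)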
16:10 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:05 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
15:51 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:44 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
15:36 <elukey> reboot an-worker1100 to see if it helps with the strange BBU behavior [production]
13:55 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon2001-dev.codfw.wmnet [production]
13:44 <andrew@cumin1001> START - Cookbook sre.hosts.decommission for hosts cloudcephmon2001-dev.codfw.wmnet [production]
13:41 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2001.codfw.wmnet with reason: REIMAGE [production]
13:39 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on parse2001.codfw.wmnet with reason: REIMAGE [production]
13:24 <moritzm> installing groff bugfix updates from Buster point release [production]
12:49 <ema> cp5001: varnish-frontend-restart to test exp policy settings starting from an empty cache T275809 [production]
12:44 <moritzm> installing libbsd security updates for Buster [production]
12:39 <moritzm> installing xcftools security updates [production]
12:31 <marostegui@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15264 and previous config saved to /var/cache/conftool/dbconfig/20210408-123137-root.json [production]
12:16 <marostegui@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15263 and previous config saved to /var/cache/conftool/dbconfig/20210408-121633-root.json [production]
12:01 <marostegui@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15262 and previous config saved to /var/cache/conftool/dbconfig/20210408-120128-root.json [production]
11:58 <XioNoX> tighten all routers loopback firewall filter - T207799 [production]
11:57 <zpapierski@deploy1002> Finished deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix (duration: 00m 09s) [production]
11:57 <zpapierski@deploy1002> Started deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix [production]
11:50 <XioNoX> tighten cr3-ulsfo loopback firewall filter - T207799 [production]
11:49 <zpapierski@deploy1002> Finished deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix (duration: 01m 39s) [production]
11:47 <zpapierski@deploy1002> Started deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix [production]
11:46 <marostegui@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15261 and previous config saved to /var/cache/conftool/dbconfig/20210408-114625-root.json [production]
11:23 <marostegui@cumin1001> dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15259 and previous config saved to /var/cache/conftool/dbconfig/20210408-112332-root.json [production]
11:09 <filippo@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2028.codfw.wmnet [production]
11:08 <marostegui@cumin1001> dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15258 and previous config saved to /var/cache/conftool/dbconfig/20210408-110828-root.json [production]
11:07 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: de1670cbd2c59a24f1e29a6d3731e3ac7f39d336: Enable Growth for newcomers on simplewiki, mswiki, tawiki (T278369; T277562; T277550) (duration: 01m 07s) [production]
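(A 'Synchronized wmf-config/InitialiseSettings.php' line like the one above is what scap logs after a single-file config sync from the deployment host. A minimal sketch of the underlying command, with the log message shown purely as an illustration:
    scap sync-file wmf-config/InitialiseSettings.php 'Enable Growth for newcomers on simplewiki, mswiki, tawiki (T278369, T277562, T277550)'
)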
10:53 <marostegui@cumin1001> dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15257 and previous config saved to /var/cache/conftool/dbconfig/20210408-105324-root.json [production]
10:47 <effie> disable puppet on parsoid* servers [production]
10:41 <XioNoX> enable sampling on all routers FPCs [production]
10:40 <marostegui> Upgrade db2085's kernel [production]
10:38 <marostegui@cumin1001> dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15256 and previous config saved to /var/cache/conftool/dbconfig/20210408-103821-root.json [production]
10:37 <mvolz@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
10:32 <XioNoX> enable sampling on cr1-codfw:fpc0 [production]
10:30 <marostegui> Upgrade kernel on db1118 [production]
10:28 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1118 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15255 and previous config saved to /var/cache/conftool/dbconfig/20210408-102855-marostegui.json [production]
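(This depool commit and the staged '(re)pooling @ 25/50/75/100%' commits above are the usual dbctl cycle around database maintenance. A rough sketch of the commands behind these entries, assuming the standard dbctl CLI on the cumin hosts:
    dbctl instance db1118 depool
    dbctl config commit -m 'Depool db1118 for kernel upgrade'
    # upgrade and reboot the host, then repool gradually
    dbctl instance db1118 pool -p 25
    dbctl config commit -m 'db1118 (re)pooling @ 25%: Repool db1118 after kernel upgrade'
)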
10:27 <effie> enable puppet on all mw* servers [production]