2021-07-06
14:16 <ottomata> restarted aqs for july mw history snapshot deploy [analytics]
14:00 <marostegui@cumin1001> dbctl commit (dc=all): 'db2072 (re)pooling @ 100%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16777 and previous config saved to /var/cache/conftool/dbconfig/20210706-140049-root.json [production]
13:53 <otto@cumin1001> END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) [production]
13:49 <otto@cumin1001> START - Cookbook sre.aqs.roll-restart [production]
13:49 <otto@cumin1001> END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) [production]
13:49 <otto@cumin1001> START - Cookbook sre.aqs.roll-restart [production]
13:45 <marostegui@cumin1001> dbctl commit (dc=all): 'db2072 (re)pooling @ 75%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16776 and previous config saved to /var/cache/conftool/dbconfig/20210706-134545-root.json [production]
13:30 <marostegui@cumin1001> dbctl commit (dc=all): 'db2072 (re)pooling @ 50%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16775 and previous config saved to /var/cache/conftool/dbconfig/20210706-133041-root.json [production]
13:29 <joal> Run first manual empty job for webrequest_test on analytics-test-hadoop [analytics]
13:29 <joal> Clean gobblin state_store and data before starting webrequest_test on analytics-test-hadoop [analytics]
13:15 <marostegui@cumin1001> dbctl commit (dc=all): 'db2072 (re)pooling @ 25%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16774 and previous config saved to /var/cache/conftool/dbconfig/20210706-131537-root.json [production]
12:02 <marostegui@cumin1001> dbctl commit (dc=all): 'db2071 (re)pooling @ 100%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16773 and previous config saved to /var/cache/conftool/dbconfig/20210706-120242-root.json [production]
11:58 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2072', diff saved to https://phabricator.wikimedia.org/P16772 and previous config saved to /var/cache/conftool/dbconfig/20210706-115820-marostegui.json [production]
11:57 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1118', diff saved to https://phabricator.wikimedia.org/P16771 and previous config saved to /var/cache/conftool/dbconfig/20210706-115732-marostegui.json [production]
11:47 <marostegui@cumin1001> dbctl commit (dc=all): 'db2071 (re)pooling @ 75%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16770 and previous config saved to /var/cache/conftool/dbconfig/20210706-114739-root.json [production]
11:32 <marostegui@cumin1001> dbctl commit (dc=all): 'db2071 (re)pooling @ 50%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16769 and previous config saved to /var/cache/conftool/dbconfig/20210706-113235-root.json [production]
11:17 <marostegui@cumin1001> dbctl commit (dc=all): 'db2071 (re)pooling @ 25%: Repool after index change', diff saved to https://phabricator.wikimedia.org/P16768 and previous config saved to /var/cache/conftool/dbconfig/20210706-111731-root.json [production]
11:16 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2071', diff saved to https://phabricator.wikimedia.org/P16767 and previous config saved to /var/cache/conftool/dbconfig/20210706-111635-marostegui.json [production]
10:19 <moritzm> installing jackson-databind security updates on buster [production]
09:01 <_joe_> repooling wdqs1007 now that lag has caught up [production]
08:43 <moritzm> installing libuv1 security updates on buster [production]
07:18 <wikibugs> Updated channels.yaml to: f23263a69fa3084e9a01bbcbef417c914a1ea93c Remove ##wmt, project archived [tools.wikibugs]
07:12 <wikibugs> Updated channels.yaml to: 8e21ef1b7237c7e3ebff38ea999bd78dc16b5329 Remove #wikimedia-teampractices, project archived [tools.wikibugs]
07:06 <marostegui> Upgrade db1104 kernel [production]
06:54 <moritzm> installing PHP 7.3 security updates on buster [production]
06:50 <marostegui> Upgrade db1122 kernel [production]
06:49 <majavah> restart bot after libera ban was removed, looks to be stable again [tools.bridgebot]
06:35 <marostegui> Upgrade db1138 kernel [production]
06:31 <marostegui> Upgrade db1160 kernel [production]
01:16 <legoktm> reloaded zuul for https://gerrit.wikimedia.org/r/703219 [releng]
00:56 <eileen> process-control config revision is 8d46b52ed4 [production]
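The db2071/db2072 entries above follow a staged repool pattern: depool, then pool back at 25/50/75/100% with a `dbctl config commit` at each step, roughly 15 minutes apart. A dry-run sketch of that sequence, printing the commands rather than running them (the `instance … depool`/`pool -p`/`config commit` subcommands are assumed from conftool's dbctl; host and percentages mirror the log):

```shell
# Dry-run: emit the staged-repool commands seen in the log above.
host=db2072
echo "dbctl instance $host depool"
echo "dbctl config commit -m 'Depool $host'"
for pct in 25 50 75 100; do
  echo "dbctl instance $host pool -p $pct"
  echo "dbctl config commit -m '$host (re)pooling @ ${pct}%: Repool after index change'"
done
```

In the real workflow each step is followed by a pause so replication and query load can be observed before raising the weight further.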
2021-07-05
17:40 <legoktm> published fixed docker-registry.discovery.wmnet/nodejs10-devel:0.0.4 image (T286212) [production]
15:24 <_joe_> leaving wdqs1007 depooled so that the updater can recover faster, now at 16.5 hours of lag [production]
15:02 <Amir1> deployed 703212 (T286058) [releng]
14:50 <majavah> restart bot to have it re-login [tools.hat-collector]
14:01 <moritzm> uploaded nginx 1.13.9-1+wmf3 for stretch-wikimedia [production]
12:50 <marostegui> Stop MySQL on db1117:3321 to clone db1125 T286042 [production]
11:29 <moritzm> installing openexr security updates on stretch [production]
11:07 <moritzm> installing tiff security updates on stretch [production]
10:48 <moritzm> upgrading PHP on miscweb* [production]
10:37 <jbond> enable puppet fleet wide post puppetdb change [production]
10:29 <marostegui> Optimize ruwiki.logging on s6 eqiad with replication T286102 [production]
10:27 <jbond> disable puppet fleet wide to perform puppetdb change [production]
08:15 <moritzm> rolling out debmonitor-client 0.3.0 [production]
08:03 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on releases1002.eqiad.wmnet with reason: bump CPU count [production]
08:03 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 0:30:00 on releases1002.eqiad.wmnet with reason: bump CPU count [production]
07:55 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on releases2002.codfw.wmnet with reason: bump CPU count [production]
07:55 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 0:30:00 on releases2002.codfw.wmnet with reason: bump CPU count [production]
07:04 <_joe_> restarting blazegraph, then restarting the updater again [production]
06:48 <moritzm> start rasdaemon on sretest1001; it didn't start after the last reboot a week ago [production]
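The 13:49 entries on 2021-07-06 show a cookbook run ending in FAIL (exit_code=99) followed by an immediate retry that passes. A minimal sketch of that retry-until-pass pattern, with `run_cookbook` as a hypothetical stand-in for the real `sudo cookbook sre.aqs.roll-restart` invocation (here it simply fails on the first call and succeeds on the second, to mirror the log):

```shell
# Retry a cookbook until it exits 0, as the operator did at 13:49.
attempts=0
run_cookbook() {
  # Hypothetical stand-in: first call fails (like exit_code=99),
  # second call succeeds (like exit_code=0).
  attempts=$((attempts + 1))
  [ "$attempts" -ge 2 ]
}

until run_cookbook; do
  echo "cookbook failed, retrying"
done
echo "cookbook passed after $attempts attempt(s)"
```

In practice a blind retry loop is only safe for idempotent cookbooks such as a roll-restart; a failed run should otherwise be investigated before rerunning.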