2021-05-11
22:14 |
<legoktm@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lists1002.wikimedia.org |
[production] |
22:05 |
<legoktm@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts lists1002.wikimedia.org |
[production] |
21:37 |
<Urbanecm> |
Start server-side upload for 3 video files (T282566, T282565, T282559) |
[production] |
21:37 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1012.eqiad.wmnet with reason: REIMAGE |
[production] |
21:34 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1012.eqiad.wmnet with reason: REIMAGE |
[production] |
20:52 |
<legoktm> |
upgraded mailman3 on lists1001 |
[production] |
20:37 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host people2002.codfw.wmnet |
[production] |
20:24 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@270c753] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] (duration: 06m 57s) |
[production] |
20:17 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@270c753] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] |
[production] |
20:17 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@270c753] (thin): Regular analytics weekly train THIN [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] (duration: 00m 05s) |
[production] |
20:17 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@270c753] (thin): Regular analytics weekly train THIN [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] |
[production] |
20:17 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@270c753]: Regular analytics weekly train [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] (duration: 17m 01s) |
[production] |
20:00 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@270c753]: Regular analytics weekly train [analytics/refinery@270c753fc746b979cf90e1537f9a67ede6372795] |
[production] |
19:55 |
<dzahn@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host people2002.codfw.wmnet |
[production] |
19:46 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7e0598d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] (duration: 09m 45s) |
[production] |
19:37 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7e0598d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] |
[production] |
19:33 |
<dancy@deploy1002> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.5 |
[production] |
19:29 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7e0598d] (thin): Regular analytics weekly train THIN [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] (duration: 00m 07s) |
[production] |
19:29 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7e0598d] (thin): Regular analytics weekly train THIN [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] |
[production] |
19:28 |
<mforns@deploy1002> |
Finished deploy [analytics/refinery@7e0598d]: Regular analytics weekly train [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] (duration: 45m 45s) |
[production] |
18:54 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1011.eqiad.wmnet with reason: REIMAGE |
[production] |
18:53 |
<otto@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Migrate VirtualPageView to EventPlatform on testwiki - T238138 (duration: 01m 09s) |
[production] |
18:52 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1011.eqiad.wmnet with reason: REIMAGE |
[production] |
18:43 |
<mforns@deploy1002> |
Started deploy [analytics/refinery@7e0598d]: Regular analytics weekly train [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] |
[production] |
18:20 |
<dancy@deploy1002> |
Finished scap: testwikis wikis to 1.37.0-wmf.5 (duration: 09m 43s) |
[production] |
18:10 |
<dancy@deploy1002> |
Started scap: testwikis wikis to 1.37.0-wmf.5 |
[production] |
17:36 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) (duration: 01m 25s) |
[production] |
17:35 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE |
[production] |
17:35 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) |
[production] |
17:34 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) (duration: 02m 27s) |
[production] |
17:33 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE |
[production] |
17:32 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) |
[production] |
17:31 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@2604d7b]: testing default policy deployment in codfw1dev (duration: 01m 59s) |
[production] |
17:29 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@2604d7b]: testing default policy deployment in codfw1dev |
[production] |
17:20 |
<mutante> |
the backend for people.wikimedia.org switched from people1002 to people1003, the people.wikimedia.org CNAME has been updated. MOTD is about to be updated to inform users. |
[production] |
17:18 |
<legoktm> |
disabled pipermail redirects on lists.wikimedia.org |
[production] |
17:07 |
<dancy@deploy1002> |
scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details) |
[production] |
16:12 |
<jynus> |
restarting bacula-dir on backup1001, stuck process |
[production] |
15:59 |
<dancy@deploy1002> |
rebuilt and synchronized wikiversions files: (no justification provided) |
[production] |
15:58 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwlog1001.eqiad.wmnet |
[production] |
15:55 |
<bstorm> |
restart haproxy on dbproxy1018/9 to remove old config |
[production] |
15:47 |
<herron@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts mwlog1001.eqiad.wmnet |
[production] |
15:38 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwlog2001.codfw.wmnet |
[production] |
15:37 |
<dancy@deploy1002> |
scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details) |
[production] |
15:36 |
<dancy@deploy1002> |
sync-world aborted: testwikis wikis to 1.37.0-wmf.4 (duration: 02m 04s) |
[production] |
15:34 |
<dancy@deploy1002> |
Started scap: testwikis wikis to 1.37.0-wmf.4 |
[production] |
15:33 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:31 |
<dancy@deploy1002> |
scap failed: RuntimeError scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details) (duration: 17m 36s) |
[production] |
15:31 |
<dancy@deploy1002> |
scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details) |
[production] |
15:27 |
<herron@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts mwlog2001.codfw.wmnet |
[production] |