2021-01-11
09:31 |
<marostegui> |
Sanitize db1155:3314 - T268742 |
[production] |
09:31 |
<marostegui> |
Deploy schema change on s1 codfw master - T270187 |
[production] |
09:02 |
<elukey> |
force puppet on logstash1007 after ES OOM |
[production] |
08:54 |
<godog> |
swift codfw-prod: more weight to ms-be20[58-61] - T269337 |
[production] |
08:24 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1030.eqiad.wmnet with reason: REIMAGE |
[production] |
08:22 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2030.codfw.wmnet with reason: REIMAGE |
[production] |
08:20 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc1030.eqiad.wmnet with reason: REIMAGE |
[production] |
08:19 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2030.codfw.wmnet with reason: REIMAGE |
[production] |
07:49 |
<dcausse> |
depooling & restarting blazegraph on wdqs2007 (T242453) |
[production] |
07:48 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1136 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13709 and previous config saved to /var/cache/conftool/dbconfig/20210111-074853-root.json |
[production] |
07:43 |
<dcausse> |
repool wdqs1007 (wrong machine) (T242453) |
[production] |
07:41 |
<dcausse> |
depooling & restarting blazegraph on wdqs1007 (T242453) |
[production] |
07:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1136 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P13708 and previous config saved to /var/cache/conftool/dbconfig/20210111-073349-root.json |
[production] |
07:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1136 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13707 and previous config saved to /var/cache/conftool/dbconfig/20210111-071846-root.json |
[production] |
07:12 |
<marostegui> |
Deploy schema change on s8 codfw master - T270187 |
[production] |
07:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1136 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P13706 and previous config saved to /var/cache/conftool/dbconfig/20210111-070342-root.json |
[production] |
06:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1136', diff saved to https://phabricator.wikimedia.org/P13704 and previous config saved to /var/cache/conftool/dbconfig/20210111-065640-marostegui.json |
[production] |
06:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1079 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13703 and previous config saved to /var/cache/conftool/dbconfig/20210111-065550-root.json |
[production] |
06:40 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1079 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13702 and previous config saved to /var/cache/conftool/dbconfig/20210111-064046-root.json |
[production] |
06:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P13701 and previous config saved to /var/cache/conftool/dbconfig/20210111-063226-marostegui.json |
[production] |
06:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1094', diff saved to https://phabricator.wikimedia.org/P13700 and previous config saved to /var/cache/conftool/dbconfig/20210111-063155-marostegui.json |
[production] |
06:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1094', diff saved to https://phabricator.wikimedia.org/P13699 and previous config saved to /var/cache/conftool/dbconfig/20210111-063124-marostegui.json |
[production] |
06:04 |
<marostegui> |
Depool db1121 to clone db1155:3314 |
[production] |
06:04 |
<marostegui> |
Deploy schema change on s7 codfw master - T270187 |
[production] |
06:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P13698 and previous config saved to /var/cache/conftool/dbconfig/20210111-060342-marostegui.json |
[production] |
2021-01-08
19:48 |
<andrew@deploy1001> |
Finished deploy [striker/deploy@e4db843]: Striker deploy for T269004 (duration: 02m 11s) |
[production] |
19:45 |
<andrew@deploy1001> |
Started deploy [striker/deploy@e4db843]: Striker deploy for T269004 |
[production] |
19:28 |
<andrew@deploy1001> |
Finished deploy [horizon/deploy@7466703]: Horizon with a bunch of Buster patches (duration: 02m 35s) |
[production] |
19:26 |
<andrew@deploy1001> |
Started deploy [horizon/deploy@7466703]: Horizon with a bunch of Buster patches |
[production] |
18:02 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@db9da3c] (thin): Hotfix analytics deployment - THIN [analytics/refinery@db9da3c] (duration: 00m 07s) |
[production] |
18:02 |
<joal@deploy1001> |
Started deploy [analytics/refinery@db9da3c] (thin): Hotfix analytics deployment - THIN [analytics/refinery@db9da3c] |
[production] |
18:01 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@db9da3c]: Hotfix analytics deployment [analytics/refinery@db9da3c] (duration: 11m 27s) |
[production] |
17:50 |
<joal@deploy1001> |
Started deploy [analytics/refinery@db9da3c]: Hotfix analytics deployment [analytics/refinery@db9da3c] |
[production] |
17:33 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled |
[production] |
17:33 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled |
[production] |
17:15 |
<hnowlan@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled |
[production] |
17:15 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled |
[production] |
17:15 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on maps1009.eqiad.wmnet with reason: Downtiming while not pooled |
[production] |
17:15 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps1009.eqiad.wmnet with reason: Downtiming while not pooled |
[production] |
17:10 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on labweb1001.wikimedia.org with reason: REIMAGE |
[production] |
17:08 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on labweb1001.wikimedia.org with reason: REIMAGE |
[production] |
16:50 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:43 |
<razzi@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
16:42 |
<andrewbogott> |
shutting down labweb1001 so I can really believe that all traffic is being served by 1002 |
[production] |
16:35 |
<andrew@deploy1001> |
Finished deploy [horizon/deploy@7466703]: selective disable of problematic compression block (duration: 01m 42s) |
[production] |
16:33 |
<andrew@deploy1001> |
Started deploy [horizon/deploy@7466703]: selective disable of problematic compression block |
[production] |
16:32 |
<andrew@deploy1001> |
Finished deploy [horizon/deploy@7466703]: selective disable of problematic compression block (duration: 01m 52s) |
[production] |
16:30 |
<razzi@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
16:30 |
<andrew@deploy1001> |
Started deploy [horizon/deploy@7466703]: selective disable of problematic compression block |
[production] |