2022-03-19
03:51 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:51 <andrew@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:29 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:28 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:28 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:28 <andrew@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
03:18 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
02:52 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
02:27 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
02:10 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
01:58 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1149 (T300775)', diff saved to https://phabricator.wikimedia.org/P22839 and previous config saved to /var/cache/conftool/dbconfig/20220319-015847-marostegui.json [production]
01:58 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance [production]
01:58 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance [production]
01:58 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300775)', diff saved to https://phabricator.wikimedia.org/P22838 and previous config saved to /var/cache/conftool/dbconfig/20220319-015839-marostegui.json [production]
01:49 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage [production]
01:46 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage [production]
01:43 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P22837 and previous config saved to /var/cache/conftool/dbconfig/20220319-014334-marostegui.json [production]
01:34 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
01:28 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P22836 and previous config saved to /var/cache/conftool/dbconfig/20220319-012829-marostegui.json [production]
01:23 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
01:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300775)', diff saved to https://phabricator.wikimedia.org/P22835 and previous config saved to /var/cache/conftool/dbconfig/20220319-011324-marostegui.json [production]
00:58 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
00:55 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
2022-03-18
21:16 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage [production]
21:12 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage [production]
21:02 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye [production]
15:38 <jayme> powercycle kubernetes1002 [production]
14:43 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
14:30 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
14:26 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.26/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: [[gerrit:771907|Don't pass the revision to PO access service (T304127)]] (duration: 00m 49s) [production]
14:12 <XioNoX> configure NAT for civi1002 - T304098 [production]
14:02 <kharlan@deploy1002> helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply [production]
14:02 <kharlan@deploy1002> helmfile [codfw] START helmfile.d/services/linkrecommendation: apply [production]
14:01 <kharlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply [production]
14:01 <kharlan@deploy1002> helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply [production]
13:59 <kharlan@deploy1002> helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply [production]
13:59 <kharlan@deploy1002> helmfile [staging] START helmfile.d/services/linkrecommendation: apply [production]
13:08 <jbond@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "test sync - jbond@cumin1001" [production]
13:07 <jbond@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "test sync - jbond@cumin1001" [production]
13:02 <moritzm> imported python3.5 3.5.3-1+deb9u5+wmf1 to component/python35 T303801 [production]
12:35 <btullis@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. [production]
11:35 <kharlan@deploy1002> helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply [production]
11:33 <kharlan@deploy1002> helmfile [codfw] START helmfile.d/services/linkrecommendation: apply [production]
11:32 <kharlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply [production]
11:30 <kharlan@deploy1002> helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply [production]
11:29 <kharlan@deploy1002> helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply [production]
11:28 <kharlan@deploy1002> helmfile [staging] START helmfile.d/services/linkrecommendation: apply [production]
11:09 <vgutierrez> rolling restart of nginx on ncredir instances to catch up on OpenSSL updates [production]
11:05 <vgutierrez> restarting acme-chief and acme-chief API services to catch up on OpenSSL updates [production]
10:58 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1007.eqiad.wmnet [production]