2023-04-25
15:22 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update [production]
15:22 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update [production]
15:22 <claime> Datacenter Service Switchback concluded - T335015 [production]
15:21 <cgoubert@deploy1002> Synchronized README: check the deployment server after switchback - T335015 (duration: 19m 55s) [production]
15:19 <cgoubert@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors [production]
15:19 <cgoubert@cumin2002> START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors [production]
15:19 <cgoubert@cumin2002> START - Cookbook sre.discovery.service-route depool restbase-async in eqiad: T335015 [production]
15:18 <claime> Restoring restbase-async to codfw only - T335015 [production]
15:18 <cgoubert@deploy1002> Finished deploy [restbase/deploy@a08f56d]: (no justification provided) (duration: 13m 06s) [production]
15:08 <herron@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS bullseye [production]
15:05 <cgoubert@deploy1002> Started deploy [restbase/deploy@a08f56d]: (no justification provided) [production]
15:02 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
15:02 <inflatador> [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` [production]
15:02 <btullis@cumin1001> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) [production]
15:02 <btullis@cumin1001> Added views for new wiki: vewikimedia T330704 [production]
15:01 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
15:01 <btullis@cumin1001> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) [production]
15:01 <btullis@cumin1001> Added views for new wiki: ckbwiktionary T331834 [production]
15:01 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
15:00 <btullis@cumin1001> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) [production]
15:00 <btullis@cumin1001> Added views for new wiki: fatwiki T335018 [production]
15:00 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
15:00 <btullis@cumin1001> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) [production]
15:00 <btullis@cumin1001> Added views for new wiki: kcgwiktionary T334739 [production]
15:00 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
14:59 <btullis@cumin1001> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) [production]
14:59 <btullis@cumin1001> Added views for new wiki: guwwikinews T334408 [production]
14:59 <btullis@cumin1001> START - Cookbook sre.wikireplicas.add-wiki [production]
14:58 <bking@deploy1002> Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 07m 38s) [production]
14:54 <cgoubert@deploy2002> Unlocked for deployment [ALL REPOSITORIES]: Datacenter Service Switchback - T335015 (duration: 81m 19s) [production]
14:51 <herron@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage [production]
14:50 <bking@deploy1002> Started deploy [wdqs/wdqs@0e051d8]: 0.3.123 [production]
14:48 <herron@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage [production]
14:45 <claime> Running authdns-update - T335015 [production]
14:45 <inflatador> [WDQS Deploy] Gearing up for deploy of wdqs `0.3.123`. Pre-deploy tests passing on canary `wdqs1003` [production]
14:44 <claime> Switch deployment server back to eqiad - T335015 [production]
14:43 <claime> All active/active services repooled in codfw - T335015 [production]
14:43 <cgoubert@cumin1001> END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in codfw: Datacenter Services Switchback - T335015 [production]
14:36 <herron@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS bullseye [production]
14:35 <herron@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS bullseye [production]
14:26 <cgoubert@cumin1001> START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: Datacenter Services Switchback - T335015 [production]
14:26 <claime> All services pooled in eqiad, all depooled in codfw, proceeding with repooling active/active services in codfw - T335015 [production]
14:25 <cgoubert@cumin1001> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None [production]
14:25 <cgoubert@cumin1001> START - Cookbook sre.discovery.datacenter status all services in all: None - None [production]
14:24 <cgoubert@cumin1001> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Services Switchback - T335015 [production]
14:19 <cgoubert@cumin1001> START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015 [production]
14:19 <herron@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage [production]
14:18 <cgoubert@cumin1001> END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) depool all services in codfw: Datacenter Services Switchback - T335015 [production]
14:16 <herron@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage [production]
14:04 <cgoubert@cumin1001> START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015 [production]