2024-07-01 §
08:15 <marostegui@cumin1002> dbctl commit (dc=all): 'db1169 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P65561 and previous config saved to /var/cache/conftool/dbconfig/20240701-081514-root.json [production]
08:13 <marostegui@cumin1002> dbctl commit (dc=all): 'db1195 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P65560 and previous config saved to /var/cache/conftool/dbconfig/20240701-081307-root.json [production]
08:07 <marostegui@cumin1002> END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1169.eqiad.wmnet onto db1195.eqiad.wmnet [production]
07:44 <elukey> `apt-get clean` on build2001 to free some space in the root partition [production]
07:02 <marostegui@cumin1002> dbctl commit (dc=all): 'Place db1195 in s1 T368871', diff saved to https://phabricator.wikimedia.org/P65559 and previous config saved to /var/cache/conftool/dbconfig/20240701-070243-marostegui.json [production]
06:36 <marostegui@cumin1002> START - Cookbook sre.mysql.clone of db1169.eqiad.wmnet onto db1195.eqiad.wmnet [production]
06:36 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1169 T368871', diff saved to https://phabricator.wikimedia.org/P65558 and previous config saved to /var/cache/conftool/dbconfig/20240701-063601-root.json [production]
06:33 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2116 (T364069)', diff saved to https://phabricator.wikimedia.org/P65557 and previous config saved to /var/cache/conftool/dbconfig/20240701-063344-marostegui.json [production]
06:33 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
06:33 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
05:02 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: Reboot [production]
05:02 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: Reboot [production]
04:56 <marostegui> Failover m2 from db1195 to db1228 - T368494 [production]
04:52 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2133,2160].codfw.wmnet,db[1195,1217,1228].eqiad.wmnet with reason: m2 switchover T368494 [production]
04:51 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on db[2133,2160].codfw.wmnet,db[1195,1217,1228].eqiad.wmnet with reason: m2 switchover T368494 [production]
04:50 <marostegui> dbmaint eqiad Rebuild pagelinks table on s8 master T364069 [production]
04:49 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1156 (T367856)', diff saved to https://phabricator.wikimedia.org/P65556 and previous config saved to /var/cache/conftool/dbconfig/20240701-044945-marostegui.json [production]
04:49 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
04:49 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
04:49 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance [production]
04:49 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance [production]
2024-06-30 §
23:25 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:25 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:17 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:15 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:14 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:14 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:14 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:13 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:12 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:11 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:11 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:11 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:11 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:09 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
23:09 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:05 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
23:03 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
22:56 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
22:55 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
22:53 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
22:53 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
22:51 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
21:27 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow: apply [production]
21:27 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow: apply [production]
17:08 <_joe_> delete failing pod in eqiad for mw-api-ext, caused the backend errors page [production]
2024-06-29 §
01:24 <jclark@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1040.eqiad.wmnet with OS bullseye [production]
01:12 <jclark@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
01:10 <jclark@cumin1002> START - Cookbook sre.dns.netbox [production]
00:04 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephosd1040.eqiad.wmnet with OS bullseye [production]