2022-03-25
08:24 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance [production]
08:04 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1174 (T302658)', diff saved to https://phabricator.wikimedia.org/P23065 and previous config saved to /var/cache/conftool/dbconfig/20220325-080403-marostegui.json [production]
08:04 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance [production]
08:04 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance [production]
08:03 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23064 and previous config saved to /var/cache/conftool/dbconfig/20220325-080355-marostegui.json [production]
08:02 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance [production]
08:02 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance [production]
07:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [production]
07:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [production]
07:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance [production]
07:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance [production]
07:56 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23063 and previous config saved to /var/cache/conftool/dbconfig/20220325-075610-ladsgroup.json [production]
07:48 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P23062 and previous config saved to /var/cache/conftool/dbconfig/20220325-074850-marostegui.json [production]
07:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23061 and previous config saved to /var/cache/conftool/dbconfig/20220325-074105-ladsgroup.json [production]
07:33 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P23060 and previous config saved to /var/cache/conftool/dbconfig/20220325-073345-marostegui.json [production]
07:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23059 and previous config saved to /var/cache/conftool/dbconfig/20220325-072559-ladsgroup.json [production]
07:18 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23058 and previous config saved to /var/cache/conftool/dbconfig/20220325-071840-marostegui.json [production]
07:10 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23057 and previous config saved to /var/cache/conftool/dbconfig/20220325-071054-ladsgroup.json [production]
06:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23056 and previous config saved to /var/cache/conftool/dbconfig/20220325-064139-ladsgroup.json [production]
06:41 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
06:41 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
06:31 <_joe_> deleting a couple zotero pods with excessive number of restarts [production]
06:29 <marostegui> dbmaint s4@eqiad T300775 [production]
06:07 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1112 for schema change', diff saved to https://phabricator.wikimedia.org/P23055 and previous config saved to /var/cache/conftool/dbconfig/20220325-060723-marostegui.json [production]
05:47 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23054 and previous config saved to /var/cache/conftool/dbconfig/20220325-054705-marostegui.json [production]
05:47 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance [production]
05:46 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance [production]
05:30 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1134 for testing', diff saved to https://phabricator.wikimedia.org/P23053 and previous config saved to /var/cache/conftool/dbconfig/20220325-053037-marostegui.json [production]
00:39 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2027.codfw.wmnet with OS buster [production]
2022-03-24
23:57 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host restbase2027.codfw.wmnet with OS buster [production]
23:04 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase2027.mgmt.codfw.wmnet with reboot policy FORCED [production]
22:30 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T302658)', diff saved to https://phabricator.wikimedia.org/P23050 and previous config saved to /var/cache/conftool/dbconfig/20220324-223031-marostegui.json [production]
22:19 <pt1979@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1047.eqiad.wmnet with OS bullseye [production]
22:15 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P23049 and previous config saved to /var/cache/conftool/dbconfig/20220324-221526-marostegui.json [production]
22:14 <pt1979@cumin2002> START - Cookbook sre.hosts.provision for host restbase2027.mgmt.codfw.wmnet with reboot policy FORCED [production]
22:10 <pt1979@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage [production]
22:07 <ebernhardson> restart wcqs-blazegraph on wcqs2001 to resolve intermittent BlazegraphFreeAllocatorsDecreasingRapidly [production]
22:06 <pt1979@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage [production]
22:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P23048 and previous config saved to /var/cache/conftool/dbconfig/20220324-220021-marostegui.json [production]
21:54 <pt1979@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye [production]
21:45 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T302658)', diff saved to https://phabricator.wikimedia.org/P23047 and previous config saved to /var/cache/conftool/dbconfig/20220324-214515-marostegui.json [production]
21:42 <pt1979@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1047.eqiad.wmnet with OS bullseye [production]
21:38 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
21:33 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
21:13 <pt1979@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye [production]
21:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
21:12 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
21:12 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
21:11 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
21:11 <inflatador> bking@cumin1001 restarting blazegraph on wdqs[1003-1013].eqiad.wmnet for T293862 [production]