2021-03-05
20:23 <James_F> Disabling deployment-memc07 on the grounds that it's an unreferenced Jessie box we don't want any more T250585 [releng]
20:15 <legoktm@deploy1002> conftool action : set/pooled=no; selector: name=registry2002.codfw.wmnet [production]
20:15 <legoktm@deploy1002> conftool action : set/pooled=no; selector: name=registry2001.codfw.wmnet [production]
20:12 <legoktm@deploy1002> conftool action : set/pooled=yes; selector: name=registry2004.codfw.wmnet [production]
20:04 <legoktm@deploy1002> conftool action : set/weight=10; selector: name=registry2004.codfw.wmnet [production]
20:04 <legoktm@deploy1002> conftool action : set/pooled=no; selector: name=registry2004.codfw.wmnet [production]
20:02 <legoktm@deploy1002> conftool action : set/pooled=no; selector: name=registry2004.codfw.wmnet [production]
19:36 <Majavah> release deployment-prep floating ip 185.15.56.7, was used for mailman upgrade which is now on its own project [releng]
19:30 <Majavah> shutdown deployment-etcd-01 to see if anything breaks, will delete if nothing has broken during next week T276462 [releng]
19:30 <legoktm@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host registry2004.codfw.wmnet [production]
19:14 <Majavah> beta cluster etcd was switched from deployment-etcd-01 to deployment-etcd02 ref T276462 [releng]
19:14 <legoktm@cumin1001> START - Cookbook sre.ganeti.makevm for new host registry2004.codfw.wmnet [production]
19:04 <mutante> phab1001 - running public_task_dump.py (from cron job) manually [production]
18:50 <legoktm@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts registry2004.eqiad.wmnet [production]
18:45 <legoktm@cumin1001> START - Cookbook sre.hosts.decommission for hosts registry2004.eqiad.wmnet [production]
18:45 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1021.eqiad.wmnet with reason: REIMAGE [production]
18:43 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1021.eqiad.wmnet with reason: REIMAGE [production]
18:30 <razzi> run again sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new [analytics]
18:23 <razzi@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:18 <razzi> sudo cookbook sre.dns.netbox -t T269211 "Move clouddb1021 to private vlan" [analytics]
18:18 <razzi@cumin1001> START - Cookbook sre.dns.netbox [production]
18:17 <razzi> re-run interface_automation.ProvisionServerNetwork with private vlan [analytics]
18:16 <razzi> delete non-mgmt interface for clouddb1021 [analytics]
17:50 <Majavah> switch deployment-prep hiera key etcd_host to use deployment-etcd02 ref T276462 [releng]
17:07 <razzi> sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new [analytics]
16:58 <razzi@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:54 <razzi> sudo cookbook sre.dns.netbox -t T269211 "Reimage and rename labsdb1012 to clouddb1021" [analytics]
16:54 <effie> depool mw1276 and pool back [production]
16:53 <razzi@cumin1001> START - Cookbook sre.dns.netbox [production]
16:52 <razzi> run script at https://netbox.wikimedia.org/extras/scripts/interface_automation.ProvisionServerNetwork/ [analytics]
16:48 <razzi> edit https://netbox.wikimedia.org/dcim/devices/2078/ device name from labsdb1012 to clouddb1021 [production]
16:47 <razzi> edit https://netbox.wikimedia.org/dcim/devices/2078/ device name from labsdb1012 to clouddb1021 [analytics]
16:36 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1036.eqiad.wmnet [production]
16:30 <razzi> delete non-mgmt interfaces for labsdb1012 at https://netbox.wikimedia.org/dcim/devices/2078/interfaces/ [production]
16:30 <razzi> delete non-mgmt interfaces for labsdb1012 at https://netbox.wikimedia.org/dcim/devices/2078/interfaces/ [analytics]
16:28 <razzi> rename https://netbox.wikimedia.org/ipam/ip-addresses/734/ DNS name from labsdb1012.mgmt.eqiad.wmnet to clouddb1021.mgmt.eqiad.wmnet [production]
16:28 <razzi> rename https://netbox.wikimedia.org/ipam/ip-addresses/734/ DNS name from labsdb1012.mgmt.eqiad.wmnet to clouddb1021.mgmt.eqiad.wmnet [analytics]
16:23 <arturo> rebooting cloudvirt1036 for T275753 [admin]
16:22 <aborrero@cumin1001> START - Cookbook sre.hosts.reboot-single for host cloudvirt1036.eqiad.wmnet [production]
16:22 <arturo> briefly rebooting traffic-cache-atsupload-buster because of the reboot of the hypervisor cloudvirt1036 [traffic]
16:17 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts labsdb1012.eqiad.wmnet [production]
16:11 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1086.eqiad.wmnet with reason: REIMAGE [production]
16:09 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1086.eqiad.wmnet with reason: REIMAGE [production]
16:08 <razzi> sudo cookbook sre.hosts.decommission labsdb1012.eqiad.wmnet -t T269211 [analytics]
16:07 <razzi@cumin1001> START - Cookbook sre.hosts.decommission for hosts labsdb1012.eqiad.wmnet [production]
15:56 <razzi> stop mariadb on labsdb1012 to reimage and rename to clouddb1021: T269211 [production]
15:52 <razzi> stop mariadb on labsdb1012 [analytics]
15:39 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on analytics1073.eqiad.wmnet with reason: REIMAGE [production]
15:39 <razzi> rebalance kafka partitions for webrequest_upload partition 10 [analytics]
15:38 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]