2020-06-18 §
20:38 <andrewbogott> rebooting cloudservices2003-dev due to a mysterious 'host down' alert on a secondary ip [admin]
2020-06-16 §
15:38 <arturo> created by hand neutron port 9c0a9a13-e409-49de-9ba3-bc8ec4801dbf `paws-haproxy-vip` (T195217) [admin]
2020-06-12 §
13:23 <arturo> DNS zone `paws.wmcloud.org` transferred to the PAWS project (T195217) [admin]
13:20 <arturo> created DNS zone `paws.wmcloud.org` (T195217) [admin]
2020-06-11 §
19:19 <bstorm_> proceeding with failback to labstore1004 now that DRBD devices are consistent T224582 [admin]
17:22 <bstorm_> delaying failback labstore1004 for drive syncs T224582 [admin]
17:17 <bstorm_> failing NFS back to labstore1004 to complete the upgrade process T224582 [admin]
16:15 <bstorm_> failing over NFS for labstore1004 to labstore1005 T224582 [admin]
2020-06-10 §
16:09 <andrewbogott> deleting all old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate database T254496 [admin]
2020-06-09 §
15:25 <arturo> icinga downtime everything cloud* lab* for 2h more (T253780) [admin]
14:09 <andrewbogott> stopping puppet, all designate services and all pdns services on cloudservices1004 for T253780 [admin]
14:01 <arturo> icinga downtime everything cloud* lab* for 2h (T253780) [admin]
2020-06-05 §
15:08 <andrewbogott> trying to re-enable puppet without losing cumin contact, as per https://phabricator.wikimedia.org/T254589 [admin]
2020-06-04 §
14:24 <andrewbogott> disabling puppet on all instances for /labs/private recovery [admin]
14:23 <arturo> disabling puppet on all instances for /labs/private recovery [admin]
2020-05-28 §
23:02 <bd808> `/usr/local/sbin/maintain-dbusers --debug harvest-replicas` (T253930) [admin]
13:36 <andrewbogott> rebuilding cloudservices2002-dev with Buster [admin]
00:33 <andrewbogott> shutting down cloudservices2002-dev to see if we can live without it. This is in anticipation of rebuilding it entirely for T253780 [admin]
2020-05-27 §
23:29 <andrewbogott> disabling the backup job on cloudbackup2001 (just like last week) so the backup doesn't start while Brooke is rebuilding labstore1004 tomorrow. [admin]
06:03 <bd808> `systemctl start mariadb` on clouddb1001 following reboot (take 2) [admin]
05:58 <bd808> `systemctl start mariadb` on clouddb1001 following reboot [admin]
05:53 <bd808> Hard reboot of clouddb1001 via Horizon. Console unresponsive. [admin]
2020-05-25 §
16:35 <arturo> [codfw1dev] created zone `0-29.57.15.185.in-addr.arpa.` (T247972) [admin]
2020-05-21 §
19:23 <andrewbogott> disabling puppet on cloudbackup2001 to prevent the backup job from starting during maintenance [admin]
19:16 <andrewbogott> systemctl disable block_sync-tools-project.service on cloudbackup2001.codfw.wmnet to avoid stepping on current upgrade [admin]
15:48 <andrewbogott> re-imaging cloudnet1003 with Buster [admin]
2020-05-19 §
22:59 <bd808> `apt-get install mariadb-client` on cloudcontrol1003 [admin]
21:12 <bd808> Migrating wcdo.wcdo.eqiad.wmflabs to cloudvirt1023 (T251065) [admin]
2020-05-18 §
21:37 <andrewbogott> rebuilding cloudnet2003-dev with Buster [admin]
2020-05-15 §
22:10 <bd808> Added reedy as projectadmin in cloudinfra project (T249774) [admin]
22:05 <bd808> Added reedy as projectadmin in admin project (T249774) [admin]
18:44 <bstorm_> rebooting cloudvirt-wdqs1003 T252831 [admin]
15:47 <bd808> Manually running wmcs-novastats-dnsleaks from cloudcontrol1003 (T252889) [admin]
2020-05-14 §
23:28 <bstorm_> downtimed cloudvirt1004/6 and cloudvirt-wdqs1003 until tomorrow around this time T252831 [admin]
22:21 <bstorm_> upgrading qemu-system-x86 on cloudvirt1006 to backports version T252831 [admin]
22:15 <bstorm_> changing /etc/libvirt/qemu.conf and restarting libvirtd on cloudvirt1006 T252831 [admin]
21:12 <andrewbogott> rebuilding cloudvirt-wdqs1003 as part of T252831 [admin]
15:47 <andrewbogott> moving cloudvirt1004 and cloudvirt1006 to the 'ceph' aggregate for T252784 [admin]
15:02 <andrewbogott> moving all of cloudvirt100[1-9] into the 'toobusy' host aggregate. These are slower, have spinning disks, and are due for replacement. [admin]
2020-05-12 §
20:33 <andrewbogott> moving cloudvirt1023 to the 'standard' pool and out of the 'spare' pool [admin]
19:10 <jeh> disable neutron-openvswitch-agent service on cloudvirt2001-dev.codfw T248881 [admin]
19:09 <jeh> Shutdown the unused eno2 network interface on cloudvirt2001-dev.codfw to clear up monitoring errors T248425 [admin]
18:20 <andrewbogott> moving cloudvirt1024 out of the 'maintenance' aggregate and into 'spare' [admin]
16:45 <andrewbogott> restarting neutron-l3-agent on cloudnet1004 so it knows about all three cloudcontrols. Leaving cloudnet1003 since restarting it there will cause network interruptions [admin]
14:06 <arturo> icinga downtime everything for 2h for Debian Buster migration in some cloud components [admin]
2020-05-09 §
16:53 <andrewbogott> rebuilding cloudcontrol2001-dev and 2003-dev with buster for T252121 [admin]
2020-05-08 §
19:02 <bstorm_> moving tools-k8s-haproxy-2 from cloudvirt1021 to cloudvirt1017 to improve spread [admin]
2020-05-05 §
13:58 <andrewbogott> rebuilding cloudcontrol2004-dev to test new puppet changes [admin]
2020-05-04 §
09:04 <arturo> [codfw1dev] manually modify iptables ruleset to only allow SSH from WMF bastions on cloudservices2003-dev and cloudcontrol2004-dev (T251604) [admin]
2020-04-21 §
22:12 <andrewbogott> moving cloudvirt1004 out of the 'standard' aggregate and into the 'maintenance' aggregate [admin]