2020-01-07 §
10:02 <arturo> icinga downtime cloudvirt1009 for 30 minutes to re-create canary VM (T242078) [admin]
2020-01-06 §
13:45 <andrewbogott> restarting nova-api and nova-conductor on cloudcontrol1003 and 1004 [admin]
2020-01-04 §
16:34 <arturo> icinga downtime cloudvirt1024 for 2 months because of hardware errors (T241884) [admin]
2019-12-31 §
11:46 <andrewbogott> I couldn't! [admin]
11:39 <andrewbogott> restarting cloudservices2002-dev to see if I can reproduce an issue I saw earlier [admin]
2019-12-25 §
10:13 <arturo> icinga downtime for 30 minutes the whole cloud* lab* fleet to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/560575 (will restart some openstack components) [admin]
2019-12-24 §
15:13 <arturo> icinga downtime all the lab* fleet for nova password change for 1h [admin]
14:39 <arturo> icinga downtime all the cloud* fleet for nova password change for 1h [admin]
2019-12-23 §
11:13 <arturo> enable puppet in cloudcontrol1003/1004 [admin]
10:40 <arturo> disable puppet in cloudcontrol1003/1004 while doing changes related to python-ldap [admin]
2019-12-22 §
23:48 <andrewbogott> restarting nova-conductor and nova-api on cloudcontrol1003 and 1004 [admin]
09:45 <arturo> cloudvirt1013 is back (did it alone) T241313 [admin]
09:37 <arturo> cloudvirt1013 is down for good. Apparently powered off. I can't even reach it via iLO [admin]
2019-12-20 §
12:43 <arturo> icinga downtime cloudmetrics1001 for 128 hours [admin]
2019-12-18 §
12:55 <arturo> [codfw1dev] created a new neutron subnet object to hold the new CIDR for floating IPs (cloud-codfw1dev-floating - T239347) [admin]
2019-12-17 §
07:21 <andrewbogott> deploying horizon/train to labweb1001/1002 [admin]
2019-12-12 §
06:11 <arturo> schedule 4h downtime for labstores [admin]
05:57 <arturo> schedule 4h downtime for cloudvirts and other openstack components due to upgrade ops [admin]
2019-12-02 §
06:28 <andrewbogott> running nova-manage db sync on eqiad1 [admin]
06:27 <andrewbogott> running nova-manage cell_v2 map_cell0 on eqiad1 [admin]
2019-11-21 §
16:07 <jeh> created replica indexes and views for szywiki T237373 [admin]
15:48 <jeh> creating replica indexes and views for shywiktionary T238115 [admin]
15:48 <jeh> creating replica indexes and views for gcrwiki T238114 [admin]
15:46 <jeh> creating replica indexes and views for minwiktionary T238522 [admin]
15:36 <jeh> creating replica indexes and views for gewikimedia T236404 [admin]
2019-11-18 §
19:27 <andrewbogott> repooling labsdb1011 [admin]
18:54 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1011 T238480 [admin]
18:44 <andrewbogott> depooling labsdb1011 and killing remaining user queries T238480 [admin]
18:42 <andrewbogott> repooled labsdb1009 and 1010 T238480 [admin]
18:19 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1010 T238480 [admin]
18:18 <andrewbogott> depooling labsdb1010, killing remaining user queries [admin]
17:46 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1009 T238480 [admin]
17:38 <andrewbogott> depooling labsdb1009, killing remaining user queries [admin]
16:54 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1012 T237509 [admin]
2019-11-15 §
20:04 <andrewbogott> repool labsdb1011 (T237509) [admin]
19:29 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1011 [admin]
19:25 <andrewbogott> depooling labsdb1011, killing remaining queries [admin]
19:25 <andrewbogott> repooling labsdb1010 [admin]
18:59 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1012 [admin]
18:57 <andrewbogott> running maintain-views --all-databases --replace-all --clean on labsdb1010 [admin]
18:54 <andrewbogott> depooling labsdb1010, killing remaining user queries [admin]
18:54 <andrewbogott> depooled labsdb1009, ran maintain-views --clean --all-databases --replace-all, repooled [admin]
2019-11-11 §
13:10 <arturo> cloudweb2001-dev: disable puppet and redirect stderr in the loadExitNodes.php cron script to prevent cronspam while we investigate the cause of the issue (T237971) [admin]
2019-11-05 §
11:59 <arturo> icinga downtime for 1h cloudcontrol1004, cloudnet1003, cloudvirt1017/1020/1022 for PDU operations in the rack T227542 [admin]
2019-11-04 §
21:55 <andrewbogott> deleting a ton of wikitech hiera pages that were either no-ops or refer to nonexistent VMs or prefixes [admin]
2019-10-31 §
11:01 <arturo> icinga-downtimed cloudvirt1030 and cloudservices1003 for 1h due to PDU upgrade operations T227543 [admin]
2019-10-30 §
22:43 <jeh> reboot cloud-bootstrapvz-stretch to resolve bad bootstrapvz build [admin]
2019-10-29 §
10:52 <arturo> icinga downtime cloudvirt1001/1002/1024/1018/1012/1009/1015/1008 for 1h T227538 [admin]
2019-10-25 §
10:45 <arturo> icinga downtime toolschecker for 1 to upgrade clouddb1002 mariadb (toolsdb secondary) (T236384, T236420) [admin]
2019-10-24 §
12:30 <arturo> starting cloudvirt1019, PDU operations ended (T227540) [admin]