2020-08-07
22:30 <bstorm> removing downtime for paws and front page monitor T211096 [paws]
21:21 <dpifke> Cherry picking latest version of https://gerrit.wikimedia.org/r/c/operations/puppet/+/601429, and enabling swift-object-expirer on deployment-ms-be05 via horizon hieradata. [releng]
20:22 <wm-bot> <lucaswerkmeister> deployed e7fd7f5c82 (support TIFF etc.) [tools.wd-image-positions]
18:01 <bstorm> shutting down paws-proxy-02 T211096 [paws]
17:49 <bstorm> shutting down the entire old cluster T211096 [tools.paws]
17:05 <bstorm> running the final rsync to the new cluster's nfs T211096 [paws]
16:42 <jforrester@deploy1001> Synchronized php-1.36.0-wmf.3/extensions/DiscussionTools/: T259855 Revert new reply API (duration: 01m 06s) [production]
16:08 <bstorm> changing paws.wmflabs.org to point at the new cluster ip 185.15.56.57 T211096 [paws]
16:02 <bstorm> LAST MESSAGE WRONG: switching NEW cluster to toolsdb T211096 [paws]
16:02 <bstorm> switching old cluster to toolsdb T211096 [paws]
15:58 <bstorm> switching old cluster to sqlite T211096 [paws]
15:53 <bstorm> downtiming alerts in case they need changes (seems likely) T211096 [paws]
15:25 <hashar> Reloaded Zuul for https://gerrit.wikimedia.org/r/618801 [releng]
15:01 <volans> import DNS names for network devices in Netbox - T258729 [production]
13:27 <godog> bounce pybal on lvs1016 and then lvs1015 to reset state, logstash1025 reported down but actually up [production]
10:27 <sukhe@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:27 <sukhe@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:02 <elukey> reboot deneb via ganeti2021 (hostname config pointing to recdns for some reason) [production]
09:15 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1092', diff saved to https://phabricator.wikimedia.org/P12195 and previous config saved to /var/cache/conftool/dbconfig/20200807-091527-marostegui.json [production]
08:47 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P12194 and previous config saved to /var/cache/conftool/dbconfig/20200807-084747-marostegui.json [production]
08:07 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P12193 and previous config saved to /var/cache/conftool/dbconfig/20200807-080719-marostegui.json [production]
07:50 <godog> prometheus codfw lvextend --resize --size +60G /dev/mapper/vg--hdd-prometheus--global [production]
07:49 <godog> prometheus codfw lvextend --resize --size +30G /dev/mapper/vg--ssd-prometheus--k8s [production]
07:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P12192 and previous config saved to /var/cache/conftool/dbconfig/20200807-074658-marostegui.json [production]
06:53 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
06:51 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
06:34 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1092 for upgrade', diff saved to https://phabricator.wikimedia.org/P12191 and previous config saved to /var/cache/conftool/dbconfig/20200807-063431-marostegui.json [production]
2020-08-06
23:46 <dpifke> Cherry-picking https://gerrit.wikimedia.org/r/c/operations/puppet/+/601429 in beta. [releng]
23:21 <catrope@deploy1001> Synchronized php-1.36.0-wmf.3/extensions/GrowthExperiments/: Fixes for WelcomeSurvey language question (T232410) (duration: 00m 59s) [production]
23:04 <catrope@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Change GrowthExperiments mentor list on fawiki (T253291) (duration: 00m 59s) [production]
21:43 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:41 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:40 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
21:39 <mholloway-shell@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
21:39 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
21:35 <mholloway-shell@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
21:33 <brennen@deploy1001> Synchronized php-1.36.0-wmf.3/vendor: [[gerrit:618850|Update git submodules (vendor)]] (T259832) (duration: 01m 08s) [production]
21:32 <mholloway-shell@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . [production]
21:02 <andrewbogott> removing cloudvirt1004/1006 from nova's list of hypervisors; rebuilding them to use as backup test hosts [admin]
20:51 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
20:51 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
20:47 <shdubsh> restart logstash -- pipeline appears stuck [production]
20:38 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [production]
20:38 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
20:19 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [production]
20:19 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
20:19 <brennen> manually updating the vendor submodule on 1.36.0 for T259832 [production]
20:15 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
20:15 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
20:06 <bstorm> manually stopped the RAID check on cloudcontrol1003 T259760 [admin]