2020-12-17
06:07 <Majavah> re-enable tasks that were disabled due to ToolsDB maintenance [tools.majavah-bot]
06:05 <marostegui@cumin1001> START - Cookbook sre.hosts.decommission [production]
05:56 <marostegui> Stop mysql on db1106 to clone db1154 [production]
05:55 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1106 for cloning db1154:3311 T268742 ', diff saved to https://phabricator.wikimedia.org/P13560 and previous config saved to /var/cache/conftool/dbconfig/20201217-055556-marostegui.json [production]
03:43 <brennen> Updating dev-images docker-pkg files on primary contint for fundraising buster base and php7.2-dba inclusion [releng]
02:22 <bstorm> Set PAWS hub back to using mariadb T266587 [paws]
02:14 <bstorm> toolsdb is back and so is the replica T266587 [clouddb-services]
01:35 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1019.eqiad.wmnet with reason: REIMAGE [production]
01:33 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1019.eqiad.wmnet with reason: REIMAGE [production]
01:10 <bstorm> the sync is done and we have a good copy of the toolsdb data, proceeding with the upgrades and stuff to that hypervisor while configuring replication to work again T266587 [clouddb-services]
01:01 <twentyafterfour> preparing to update phabricator translations [production]
00:22 <mutante> running puppet on mw2266, mw2370, mw2354 [production]
2020-12-16
23:56 <bstorm> bootstrapped meta_p database for the new s7 replicas T269427 [production]
23:48 <James_F> Zuul: Switch OOUI and Wikipeg over to versioned special jobs [releng]
23:40 <James_F> Docker: Publishing node10-test-browser-php80-composer 0.0.1 [releng]
22:00 <mutante> adjusted 'puppet prefix' deployment-jobrunner to use "role::beta::mediawiki::jobrunner" instead of "role::mediawiki::jobrunner" - goes together with gerrit:649707 - no instance currently exists called 'deployment-jobrunner' [releng]
22:00 <mutante> adjusted 'puppet prefix' deployment-jobrunner to use "role::beta::mediawiki::jobrunner" instead of "role::mediawiki::jobrunner" - goes together with gerrit:649707 - no instance currently exists called 'deployment-jobrunner' [deployment-prep]
21:06 <joal> Kill-restart virtualpageview-hourly-coord and projectview-geo-coord with manually updated jar versions (old versions in conf) [analytics]
20:12 <marxarelli> group1 to 1.36.0-wmf.22 complete. no new errors or concerning rates (refs T267415) [production]
20:06 <dduvall@deploy1001> Synchronized php: group1 wikis to 1.36.0-wmf.22 (duration: 01m 01s) [production]
20:05 <legoktm> added myself to the ops LDAP group [production]
20:05 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.22 [production]
19:35 <joal> Kill-restart all oozie jobs belonging to analytics except mediawiki-wikitext-history-coord [analytics]
19:23 <dcausse> Morning backport window deploy done [production]
19:21 <dcausse@deploy1001> Synchronized php-1.36.0-wmf.22/extensions/WikimediaEvents/: T266027: Revert [cirrus] setup perfield builder A/B test on spaceless languages (duration: 01m 03s) [production]
19:18 <dcausse@deploy1001> Synchronized php-1.36.0-wmf.21/extensions/WikimediaEvents/: T266027: Revert [cirrus] setup perfield builder A/B test on spaceless languages (duration: 01m 03s) [production]
19:09 <dcausse@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T266359: wgMinervaCountErrors config was removed (duration: 01m 03s) [production]
18:52 <joal> Kill-restart cassandra loading oozie jobs [analytics]
18:37 <joal> Kill-restart wikidata-entity, wikidata-item_page_link and mobile_apps-session_metrics oozie jobs [analytics]
18:34 <bstorm> restarted sync from toolsdb to its replica server after cleanup to prevent disk filling T266587 [clouddb-services]
18:31 <joal> Kill-rerun data-quality bundles [analytics]
18:21 <chicocvenancio> move paws to sqlite while toolsdb is down. [paws]
17:52 <effie> uploading python-thumbor-wikimedia_2.9-1 to stretch-wikimedia/component/thumbor [production]
17:31 <bstorm> sync started from toolsdb to its replica server T266587 [clouddb-services]
17:29 <bstorm> stopped mariadb on the replica T266587 [clouddb-services]
17:28 <bstorm> shutdown toolsdb T266587 [clouddb-services]
17:26 <Majavah> disable crontabs for tasks needing ToolsDB access due to maintenance [tools.majavah-bot]
17:24 <bstorm> setting toolsdb to readonly to prepare for shutdown T266587 [clouddb-services]
17:06 <bstorm> switching the secondary config back to clouddb1002 in order to minimize concerns about affecting ceph performance T266587 [clouddb-services]
16:44 <wm-bot> <lucaswerkmeister> deployed e236b28c74 (error handler for upcoming ToolsDB maintenance) [tools.quickcategories]
16:40 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2019.codfw.wmnet with reason: REIMAGE [production]
16:38 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc2019.codfw.wmnet with reason: REIMAGE [production]
16:38 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1019.eqiad.wmnet with reason: REIMAGE [production]
16:36 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc1019.eqiad.wmnet with reason: REIMAGE [production]
16:32 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
16:32 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
16:32 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'sessionstore' for release 'staging' . [production]
16:32 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
16:23 <gehel@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
16:17 <razzi> dropping and re-creating superset staging database [analytics]