2021-10-04 §
07:37 <joal@deploy1002> Finished deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] (duration: 06m 14s) [production]
07:32 <joal> Deploy refinery to hdfs [analytics]
07:31 <joal@deploy1002> Started deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] [production]
07:30 <joal@deploy1002> Finished deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] (duration: 00m 06s) [production]
07:30 <joal@deploy1002> Started deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] [production]
07:29 <joal@deploy1002> Finished deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] (duration: 19m 18s) [production]
07:19 <dcausse> restarting blazegraph on wdqs2001 & wdqs2004 (allocators burning too quickly) [production]
07:18 <elukey> depool + restart blazegraph + restart updater for wdqs1006 [production]
07:18 <elukey@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=wdqs1006.wmnet [production]
07:18 <elukey@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=wdqs1004.wmnet [production]
07:10 <joal@deploy1002> Started deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] [production]
07:10 <joal> Deploy refinery for mediawiki-history-reduced hotfix [analytics]
07:02 <godog> swift eqiad-prod: add weight to ms-be10[64-67] - T290546 [production]
06:56 <joal> Kill-restart pageview-monthly_dump-coord to apply fix for SLA [analytics]
06:44 <elukey> depool + restart blazegraph + restart updater on wdqs1004 [production]
05:50 <ladsgroup@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . [production]
05:49 <ladsgroup@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . [production]
05:47 <ladsgroup@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . [production]
2021-10-03 §
21:30 <bstorm> rebuilding buster containers since they are also affected T291387 T292355 [tools]
21:29 <bstorm> rebuilt stretch containers for potential issues with LE cert updates T291387 [tools]
18:35 <chicocvenancio> building python 2.7.18 to use in python 3.9 container T292355 [tools.video2commons]
17:28 <Operator873|CVN> restarted bots 5, 12, 28, and 29 failed to regain nick. [cvn]
14:45 <_joe_> restarting acmechief on acmechief1001 [production]
12:55 <kormat@cumin1001> dbctl commit (dc=all): 'Depool db1127, bad ram', diff saved to https://phabricator.wikimedia.org/P17414 and previous config saved to /var/cache/conftool/dbconfig/20211003-125530-kormat.json [production]
12:02 <majavah> update to python 3.9 after it broke due to recent LE changes; was using python 3.4 / jessie [tools.sge-jobs]
08:24 <elukey> powercycle cp5006 (unresponsive to ssh, remote tty available but not able to login as root, no prometheus metrics in hours) [production]
08:23 <elukey@puppetmaster1001> conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet [production]
2021-10-02 §
21:31 <Krinkle> krinkle@cvn-app8 Idem [cvn]
21:29 <Krinkle> krinkle@cvn-app9 `sudo sed -i 's#mozilla/DST_Root_CA_X3.crt#!mozilla/DST_Root_CA_X3.crt#' /etc/ca-certificates.conf && sudo update-ca-certificates` ref T292289, ref https://github.com/mono/mono/issues/21233 [cvn]
21:24 <Krinkle> /cs flags #cvn-wp-es LuchoCR local_op ; verified nick and sysop at es.wikipedia [cvn]
21:11 <Krinkle> /cs flags #cvn-wp-en tn local_op [cvn]
21:04 <Krinkle> /cs flags #cvn-wp-en tn voiced - verified nick [cvn]
17:28 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
16:10 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
2021-10-01 §
23:19 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
22:27 <mutante> puppetmaster2001 - systemctl reset-failed [production]
22:16 <mutante> puppetmaster2001 systemctl disable geoip_update_ipinfo.timer [production]
22:15 <mutante> puppetmaster2001 - sudo /usr/local/bin/geoipupdate_job after adding new shell command and timer - successfully downloaded enterprise database for T288844 [production]
21:59 <bd808> clush -w @all -b 'sudo sed -i "s#mozilla/DST_Root_CA_X3.crt#!mozilla/DST_Root_CA_X3.crt#" /etc/ca-certificates.conf && sudo update-ca-certificates' for T292289 [tools]
21:56 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
21:44 <mutante> puppetmasters - temp. disabling puppet one more time, now for a different deploy, to fetch an additional MaxMind database - T288844 [production]
21:19 <mutante> puppetmaster2001 - puppet removed cron sync_volatile and cron sync_ca - starting and verifying new timers: 'systemctl status sync-puppet-volatile', 'systemctl status sync-puppet-ca' T273673 [production]
21:12 <mutante> puppetmaster1002, puppetmaster1003, puppetmaster2002, puppetmaster2003: re-enabled puppet, they are backends. backends don't have the sync cron/job/timer, so noop as well, just like 1004/1005/2004/2005. this just leaves the actual change on 2001 - T273673 [production]
21:07 <mutante> puppetmaster1004, puppetmaster1005, puppetmaster2004, puppetmaster2005: re-enabled puppet, they are "insetup" role [production]
21:06 <mbsantos@deploy1002> Finished deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend (duration: 00m 54s) [production]
21:05 <mbsantos@deploy1002> Started deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend [production]
21:05 <mutante> puppetmaster1001 - re-enabled puppet, noop as expected, the passive host pulls from the active one, so only 2001 has the cron/job/timer [production]
21:05 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
21:02 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
21:01 <legoktm@deploy1002> Synchronized wmf-config/CommonSettings.php: Revert "Have PdfHandler use Shellbox on Commons for 10% of requests" (duration: 00m 59s) [production]