2020-04-20 §
08:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1081', diff saved to https://phabricator.wikimedia.org/P11018 and previous config saved to /var/cache/conftool/dbconfig/20200420-081911-marostegui.json [production]
08:16 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1089', diff saved to https://phabricator.wikimedia.org/P11017 and previous config saved to /var/cache/conftool/dbconfig/20200420-081623-marostegui.json [production]
08:14 <marostegui> Remove img_deleted column from db1089 (enwiki), db1081 (commonswiki), db1111 (wikidatawiki) - T250055 [production]
08:09 <jynus> restarting s3 instance on db1095 to reduce its buffer pool T250602 [production]
07:22 <_joe_> restarting php-fpm on the eqiad appservers to pick up the new max_execution_time [production]
07:20 <marostegui> Re-add tl_namespace index to db1104 and db1092 - T250060 [production]
06:44 <moritzm> installing python2.7 security updates on jessie [production]
06:41 <elukey> execute find -mtime +30 -delete in /var/log/airflow/scheduler on an-airflow1001 to free space [production]
06:25 <moritzm> installing libxdmcp security updates on jessie [production]
06:16 <moritzm> installing bash updates on jessie [production]
05:54 <vgutierrez> rolling restart of ats-tls in cp[3052,3054,3056,3058,3060,4028,4029,4030,4031,4032] - T249335 [production]
05:53 <marostegui> Deploy schema change on s8 eqiad hosts T250060 [production]
05:50 <marostegui> Deploy schema change on s8 codfw - lag will show up T250060 [production]
04:55 <ariel@deploy1001> Finished deploy [dumps/dumps@b813c8a]: no private table dumps, check for existence of 7z,bz2 page content files before dumping, various unit tests (duration: 00m 04s) [production]
04:55 <ariel@deploy1001> Started deploy [dumps/dumps@b813c8a]: no private table dumps, check for existence of 7z,bz2 page content files before dumping, various unit tests [production]
2020-04-19 §
16:19 <reedy@deploy1001> Synchronized wmf-config/LabsServices.php: labs: Move RB traffic to new stretch host (duration: 01m 11s) [production]
16:05 <vgutierrez> rolling restart of ats-tls in text@esams - T249335 [production]
05:51 <marostegui> Power back on db1140 T250602 [production]
2020-04-18 §
22:50 <addshore> pool wdqs1006 blazegraph caught up T242453 [production]
20:30 <cdanis@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=wdqs,name=eqiad [production]
20:27 <thcipriani> restart gerrit-replica [production]
16:40 <dcausse> forcing replica count to 1 on some cloudelastic@chi indices [production]
15:13 <Amir1> applying schema change of T139090 on labswiki (wikitech) [production]
14:03 <cdanis@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=eqiad [production]
12:19 <addshore> restarting blazegraph on wdqs1006 blazegraph stuck T242453 [production]
12:15 <addshore> depool wdqs1006 blazegraph stuck T242453 [production]
12:15 <addshore> depool wdqs1006 blazegraph stuck [production]
06:07 <XioNoX> change OSPF metrics to prefer ulsfo tunnel transport [production]
2020-04-17 §
19:33 <Krinkle> Depool mw1407.eqiad.wmnet for opcache testing. Do not repool without first reverting https://gerrit.wikimedia.org/r/589674. [production]
19:32 <Krinkle> Depool mw1407.eqiad.wmnet for opcache and LCStoreStaticArray testing. – T99740 [production]
17:41 <cmjohnson1> replacing network cable pc1009 T250257 [production]
17:34 <cmjohnson1> moving msw1 to msw-c racks mounted switch cable ports from port 49 to port 50 [production]
17:22 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:22 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
16:15 <Urbanecm> Revert recent email change for User:CPHL@SUL [production]
16:05 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . [production]
16:05 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . [production]
15:52 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . [production]
15:52 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . [production]
15:48 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' . [production]
15:48 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' . [production]
15:42 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
15:41 <otto@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
15:20 <rzl> remove cronjobs from mwmaint1002 previously updated to systemd timers and erroneously left in crontab -- diffs: https://phabricator.wikimedia.org/P11012 T211250 [production]
14:29 <mutante> ganeti2001 - killed and restarted gnt-rapi process with the correct new key and cert [production]
14:19 <cdanis> add peer AS29802 to cr2-eqdfw and cr2-esams [production]
14:01 <mutante> netbox1001 - netbox_ganeti_eqiad_sync / systemd state fixed after gnt-rapi is running again on ganeti1003 [production]
14:00 <mutante> ganeti1003 - fixing gnt-rapi daemon not running [production]
13:54 <mateusbs17> Running VACUUM FULL for gis DB in maps2004.codfw.wmnet (which is depooled at the moment) [production]
13:00 <mutante> netbox1001 - sudo systemctl start netbox_ganeti_eqiad_sync (was failed) [production]