2020-02-21
|
11:39 |
<elukey> |
restart varnishkafka on cp3057 (stuck in timeouts to kafka, analytics alarms raised) |
[production] |
11:37 |
<arturo> |
[codfw1dev] cleanup unused neutron subnet pools from previous address scope tests (T244851) |
[admin] |
11:21 |
<godog> |
bounce logstash on logstash1023 - see if it can catch up with elastic7 kafka lag |
[production] |
11:14 |
<elukey> |
reboot stat1005 - GPU blocked at 100% after issue with tensorflow |
[production] |
09:18 |
<akosiaris> |
depool mathoid in eqiad for a test |
[production] |
09:18 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=mathoid |
[production] |
08:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1107 after 10.4 testing - T242702', diff saved to https://phabricator.wikimedia.org/P10473 and previous config saved to /var/cache/conftool/dbconfig/20200221-085405-marostegui.json |
[production] |
08:34 |
<fdans@deploy1001> |
Finished deploy [analytics/refinery@4d56021]: deploying refinery (duration: 14m 55s) |
[production] |
08:19 |
<fdans@deploy1001> |
Started deploy [analytics/refinery@4d56021]: deploying refinery |
[production] |
08:19 |
<fdans> |
deploying refinery |
[analytics] |
08:02 |
<akosiaris> |
disable mod_remoteip on otrs host, following merge of https://gerrit.wikimedia.org/r/573877 |
[production] |
06:58 |
<marostegui> |
Stop MySQL on labsdb1012 to clone labsdb1011 - T245797 |
[production] |
06:34 |
<marostegui> |
Stop mysql on es1024 to clone es1025 - T243052 |
[production] |
05:57 |
<marostegui> |
Start MySQL on labsdb1011 without replication - T245797 |
[production] |
05:44 |
<marostegui> |
Reload haproxy on dbproxy1010, dbproxy1011, dbproxy18 - T245797 |
[production] |
02:53 |
<bstorm_> |
depooled labsdb1011 and set weight 10 on labsdb1009 vs 3 on labsdb1010 T245797 |
[production] |
02:43 |
<ejegg> |
updated Fundraising CiviCRM from a6b222c19f to c086fd4e0b |
[production] |
02:27 |
<bstorm_> |
stopped mariadb on labsdb1011 because it keeps crashing anyway |
[production] |
01:05 |
<jforrester@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Sync Beta-Cluster-only change to CommonSettings now we're sure we won't revert (duration: 00m 56s) |
[production] |
01:04 |
<andrew@deploy1001> |
Finished deploy [horizon/deploy@13ca90a]: Remove guided puppet config mode; this gets us back to working with latest puppet packages. (duration: 03m 32s) |
[production] |
01:02 |
<James_F> |
Zuul: [operations/mediawiki-config] Run tox-docker always |
[releng] |
01:01 |
<andrew@deploy1001> |
Started deploy [horizon/deploy@13ca90a]: Remove guided puppet config mode; this gets us back to working with latest puppet packages. |
[production] |
00:59 |
<James_F> |
Zuul: [operations/mediawiki-config] Run tox-docker as experimental |
[releng] |
00:18 |
<andrewbogott> |
temporarily shutting down cloud-puppetmaster-01 and -02 as part of debugging the new puppetmasters |
[cloudinfra] |
00:11 |
<joal> |
Rerun failed wikidata-json_entity-weekly-coord instances after having created the missing hive table |
[analytics] |
2020-02-20
|
23:50 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: T245787 [nlwiki] Add noindex for NS_USER and NS_USER_TALK (duration: 00m 56s) |
[production] |
23:46 |
<jforrester@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Stop setting wgVectorPrintLogo for back-compat., not read since wmf.19 (duration: 00m 56s) |
[production] |
23:45 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw232[0-4].codfw.wmnet |
[production] |
23:45 |
<mutante> |
gerrit1002 - test VM - rebooting for new disk |
[production] |
23:36 |
<wm-bot> |
<lucaswerkmeister> deployed 1f063050d9 (many code cleanups including some minor bugfixes) |
[tools.quickcategories] |
23:33 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw231[7-9].codfw.wmnet |
[production] |
23:33 |
<dzahn@cumin1001> |
conftool action : set/weight=15; selector: name=mw232[0-4].codfw.wmnet |
[production] |
23:32 |
<dzahn@cumin1001> |
conftool action : set/weight=15; selector: name=mw231[7-9].codfw.wmnet |
[production] |
23:32 |
<dzahn@cumin1001> |
conftool action : set/weight=15; selector: name=mw2381[7-9].codfw.wmnet |
[production] |
23:25 |
<mutante> |
ganeti1003 - adding another virtual 20G disk to gerrit1002 (T243808) |
[production] |
23:14 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
23:12 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
23:04 |
<jforrester@deploy1001> |
Synchronized php-1.35.0-wmf.20/includes/pager/IndexPager.php: IndexPager: Limit offset params to the max of the indices available (duration: 00m 56s) |
[production] |
23:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
22:59 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
22:45 |
<Krenair> |
Swapped 185.15.56.64 floating IP (backing puppetmaster.cloudinfra.wmflabs.org) over to cloud-puppetmaster-03 from cloud-puppetmaster-01 |
[cloudinfra] |
22:28 |
<ebernhardson> |
restart mjolnir-kafka-bulk-daemon across eqiad |
[production] |
22:28 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
22:28 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
22:28 |
<ebernhardson@deploy1001> |
Finished deploy [search/mjolnir/deploy@8908dd1]: daemons: Install stack printing signal handler on SIGUSR1 (duration: 05m 05s) |
[production] |
22:23 |
<ebernhardson@deploy1001> |
Started deploy [search/mjolnir/deploy@8908dd1]: daemons: Install stack printing signal handler on SIGUSR1 |
[production] |
22:08 |
<thcipriani> |
killing a whole bunch of backed-up beta-scap-eqiad jobs |
[releng] |
21:50 |
<James_F> |
deployment-deploy01 back online; Jenkins backlog clearing for beta cluster jobs. |
[releng] |
21:47 |
<James_F> |
Taking deployment-deploy01 temporarily offline |
[releng] |