2016-05-01
|
20:51 |
<Luke081515> |
Updating repos & databases |
[rcm.cac] |
19:37 |
<SMalyshev> |
enabled wdqs1002, put wdqs1001 in maintenance mode for reload |
[production] |
16:20 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit to 10 to avoid connection timeout errors |
[production] |
16:18 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit back to 100 to test impact on connection timeout |
[production] |
16:08 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit to 10 to test impact on connection timeout |
[production] |
15:24 |
<jynus> |
alter table puppet.fact_values to a bigint unsigned for m1 T107753 |
[production] |
15:07 |
<volans@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1040 for investigation T134114 (duration: 01m 22s) |
[production] |
14:44 |
<volans> |
truncated puppet.fact_values table to fix puppet (as documented on wikitech) |
[production] |
10:58 |
<godog> |
reboot furud.codfw.wmnet, ganeti instance with increasing load and 100% iowait, kvm/ganeti idle instance bug likely T134098 |
[production] |
2016-04-30
|
18:31 |
<Amir1> |
deploying d4f63a3 from github.com/wiki-ai/ores-wikimedia-config into targets in beta cluster via scap3 |
[releng] |
18:05 |
<Amir1> |
deployed d4f63a3 to web and worker nodes |
[ores] |
17:49 |
<Amir1> |
deploy d4f63a3 to the staging |
[ores] |
17:33 |
<Amir1> |
deploying 30ba552 to the staging |
[ores] |
14:54 |
<Amir1> |
running puppet agent manually in ores-web-03 |
[ores] |
14:52 |
<Amir1> |
added precaching role to ores-web-03 |
[ores] |
13:42 |
<elukey> |
disabled puppet on analytics1047 and scheduled downtime for the host, IO errors in dmesg for /dev/sdd. Also stopped the Hadoop daemons to remove it from the cluster temporarily (not sure how to do it properly, will write docs). |
[analytics] |
13:41 |
<elukey> |
disabled puppet on analytics1047 and scheduled downtime for the host, IO errors in dmesg for /dev/sdd. Also stopped the Hadoop daemons to remove it from the cluster temporarily (not sure how to do it properly, will write docs). |
[production] |
10:45 |
<volans> |
Reset slave on sanitarium:3311 due to corrupted relay log after skipping query for duplicate key T132416 |
[production] |
10:19 |
<volans> |
restarted slave on dbstore1001 skipping missing database T132837 |
[production] |
08:28 |
<gehel> |
restarting elasticsearch server elastic1031.eqiad.wmnet (T110236) |
[production] |
07:15 |
<gehel> |
restarting elasticsearch server elastic1030.eqiad.wmnet (T110236) |
[production] |
06:32 |
<gehel> |
restarting elasticsearch server elastic1029.eqiad.wmnet (T110236) |
[production] |
06:16 |
<gehel> |
restarting elasticsearch server elastic1028.eqiad.wmnet (T110236) |
[production] |
04:20 |
<Amir1> |
deploying 0679024 to the deploy branch |
[wikilabels] |
04:12 |
<Amir1> |
deployed 0679024 to the staging |
[wikilabels] |
01:15 |
<aude> |
applied Ibd302e1 to terbium for debugging broken wikidata rdf dumps |
[production] |
2016-04-29
|
22:57 |
<mutante> |
DNS - forced authdns-gen-zones etc from https://phabricator.wikimedia.org/T97051#1994679 on ns0/ns1/ns2 to get new language added |
[production] |
20:59 |
<gehel> |
restarting elasticsearch server elastic1027.eqiad.wmnet (T110236) |
[production] |
19:56 |
<urandom> |
(Re)starting cleanup on restbase1009-{a,b}.eqiad.wmnet |
[production] |
19:56 |
<catrope@tin> |
Synchronized php-1.27.0-wmf.22/extensions/CentralNotice/: T133971 (duration: 00m 41s) |
[production] |
19:29 |
<gehel> |
restarting elasticsearch server elastic1026.eqiad.wmnet (T110236) |
[production] |
19:07 |
<gehel> |
restarting elasticsearch server elastic1025.eqiad.wmnet (T110236) |
[production] |
18:21 |
<jzerebecki@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Wikidata/extensions/Wikibase/repo/includes/Hooks/OutputPageBeforeHTMLHookHandler.php: wmf.22 fc20c54f7915b94ec0d15ef17e207c116910623d 2 of 2 T132645 (duration: 00m 28s) |
[production] |
18:20 |
<jzerebecki@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Wikidata/extensions/Wikibase/repo/includes/Dumpers/DumpGenerator.php: wmf.22 fc20c54f7915b94ec0d15ef17e207c116910623d 1 of 2 T133924 (duration: 00m 29s) |
[production] |
18:14 |
<jzerebecki@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Wikidata/extensions/Wikibase/repo/includes/Hooks/OutputPageBeforeHTMLHookHandler.php: wmf.22 fc20c54f7915b94ec0d15ef17e207c116910623d 2 of 2 T132645 (duration: 00m 34s) |
[production] |
18:14 |
<robh> |
started all slaves via dbstore2001 this time. |
[production] |
18:12 |
<jzerebecki@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Wikidata/extensions/Wikibase/repo/includes/Dumpers/DumpGenerator.php: wmf.22 fc20c54f7915b94ec0d15ef17e207c116910623d 1 of 2 T133924 (duration: 00m 44s) |
[production] |
18:07 |
<robh> |
started all slaves via dbstore2002 per jaime's request |
[production] |
17:45 |
<gehel> |
restarting elasticsearch server elastic1024.eqiad.wmnet (T110236) |
[production] |
16:56 |
<gehel> |
restarting elasticsearch server elastic1023.eqiad.wmnet (T110236) |
[production] |
16:37 |
<jzerebecki> |
restarting zuul for 4e9d180..ebb191f |
[releng] |
16:22 |
<gehel> |
restarting elasticsearch server elastic1022.eqiad.wmnet (T110236) |
[production] |
15:45 |
<hashar> |
integration: deleting integration-trusty-1026 and cache-rsync. Maybe that will clear them up from Shinken |
[releng] |
15:29 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Repool db2047 and db2068. Depool db2008, db2009. Pool db2033 as the new x1 node. (duration: 00m 27s) |
[production] |
15:17 |
<gehel> |
restarting elasticsearch server elastic1021.eqiad.wmnet (T110236) |
[production] |
15:14 |
<hashar> |
integration: created 'cache-rsync' and 'integration-trusty-1026', attempting to have Shinken deprovision them |
[releng] |
14:56 |
<oblivian@palladium> |
conftool action : set/pooled=yes; selector: name=mw1153.eqiad.wmnet |
[production] |
14:54 |
<jynus> |
moving topology of db2033 to be the new x1 master on codfw |
[production] |
14:40 |
<oblivian@palladium> |
conftool action : set/pooled=no; selector: name=mw1153.eqiad.wmnet |
[production] |
14:32 |
<gehel> |
restarting elasticsearch server elastic1020.eqiad.wmnet (T110236) |
[production] |