2016-05-02
|
19:17 |
<mutante> |
manually removing 2FA from my own wikitech account, then adding it back |
[production] |
18:24 |
<gehel> |
deploying latest WDQS version |
[production] |
17:23 |
<robh> |
restbase2004 offline for next few hours for comparison work for new systems T132976 |
[production] |
16:01 |
<krenair@tin> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286286/ (duration: 00m 26s) |
[production] |
15:53 |
<krenair@tin> |
Synchronized wikiversions-labs.json: https://gerrit.wikimedia.org/r/#/c/283689/ (duration: 00m 25s) |
[production] |
15:53 |
<krenair@tin> |
Synchronized dblists/all-labs.dblist: https://gerrit.wikimedia.org/r/#/c/283689/ (duration: 00m 26s) |
[production] |
15:44 |
<krenair@tin> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286287/ (duration: 00m 25s) |
[production] |
15:40 |
<krenair@tin> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/286285/ (duration: 00m 25s) |
[production] |
15:32 |
<krenair@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/286434/2 (duration: 02m 02s) |
[production] |
15:28 |
<bblack> |
re-pooling esams |
[production] |
15:22 |
<jynus> |
restarting db1040 for reimage |
[production] |
15:21 |
<krenair@tin> |
Synchronized php-1.27.0-wmf.22/extensions/Math/MathRestbaseInterface.php: https://gerrit.wikimedia.org/r/#/c/286412/ (duration: 00m 26s) |
[production] |
15:07 |
<krenair@tin> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/285700/ (duration: 00m 42s) |
[production] |
14:52 |
<moritzm> |
rolling restart of zookeeper to pick up Java update |
[production] |
14:22 |
<bblack> |
starting gdnsd on esams (esams is marked down there) |
[production] |
14:20 |
<bblack> |
stopped gdnsd on eeden |
[production] |
13:13 |
<jynus> |
stopping db1040 mysql for backup before cloning |
[production] |
12:15 |
<elukey> |
deployed Varnish change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage. |
[production] |
12:13 |
<elukey> |
deployed Varnish cache::misc change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage. |
[production] |
12:12 |
<elukey> |
Merged Varnish cache::misc change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage. |
[production] |
11:21 |
<elukey> |
deployed the latest version of Event Logging from tin; service also restarted. |
[production] |
11:06 |
<moritzm> |
rolling restart of hhvm in eqiad for pcre security update |
[production] |
10:42 |
<moritzm> |
rolling restart of hhvm in codfw for pcre security update |
[production] |
09:58 |
<moritzm> |
uploaded openldap 2.4.41+wmf1 for jessie-wikimedia to carbon (T130593) |
[production] |
08:14 |
<hashar> |
Restarted stuck Jenkins (due to IRC plugin) |
[production] |
07:44 |
<moritzm> |
rebooting hassaleh/hassium for kernel upgrade to 4.4 |
[production] |
07:10 |
<moritzm> |
installing poppler security updates |
[production] |
06:46 |
<_joe_> |
rebooting serpens from ganeti, unreachable |
[production] |
02:30 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Mon May 2 02:30:33 UTC 2016 (duration 9m 18s) |
[production] |
02:21 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.22) (duration: 09m 31s) |
[production] |
2016-05-01
|
19:37 |
<SMalyshev> |
enabled wdqs1002, put wdqs1001 in maintenance mode for reload |
[production] |
16:20 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit to 10 to avoid connection timeout errors |
[production] |
16:18 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit back to 100 to test impact on connection timeout |
[production] |
16:08 |
<volans> |
changing live configuration of db1042 thread_pool_stall_limit to 10 to test impact on connection timeout |
[production] |
15:24 |
<jynus> |
alter table puppet.fact_values to a bigint unsigned for m1 T107753 |
[production] |
15:07 |
<volans@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1040 for investigation T134114 (duration: 01m 22s) |
[production] |
14:44 |
<volans> |
truncated puppet.fact_values table to fix puppet (as documented on wikitech) |
[production] |
10:58 |
<godog> |
reboot furud.codfw.wmnet, ganeti instance with increasing load and 100% iowait, kvm/ganeti idle instance bug likely T134098 |
[production] |
2016-04-30
|
13:41 |
<elukey> |
disabled puppet on analytics1047 and scheduled downtime for the host due to IO errors in dmesg for /dev/sdd. Also stopped the Hadoop daemons to remove it from the cluster temporarily (not sure how to do it properly, will write docs). |
[production] |
10:45 |
<volans> |
Reset slave on sanitarium:3311 due to corrupted relay log after skipping query for duplicate key T132416 |
[production] |
10:19 |
<volans> |
restarted slave on dbstore1001 skipping missing database T132837 |
[production] |
08:28 |
<gehel> |
restarting elasticsearch server elastic1031.eqiad.wmnet (T110236) |
[production] |
07:15 |
<gehel> |
restarting elasticsearch server elastic1030.eqiad.wmnet (T110236) |
[production] |
06:32 |
<gehel> |
restarting elasticsearch server elastic1029.eqiad.wmnet (T110236) |
[production] |
06:16 |
<gehel> |
restarting elasticsearch server elastic1028.eqiad.wmnet (T110236) |
[production] |
01:15 |
<aude> |
applied Ibd302e1 to terbium for debugging broken wikidata rdf dumps |
[production] |