2015-12-05
|
18:30 |
<gwicke> |
started nodetool decommission on restbase1008 |
[production] |
11:35 |
<reedy@tin> |
Synchronized wmf-config/CommonSettings.php: Disable common password password policy to come in wmf.8 (duration: 00m 28s) |
[production] |
11:23 |
<reedy@tin> |
Purged l10n cache for 1.27.0-wmf.5 |
[production] |
11:22 |
<reedy@tin> |
Synchronized php-1.27.0-wmf.7/extensions/WikimediaMaintenance/refreshMessageBlobs.php: Less waiting for slaves (duration: 00m 28s) |
[production] |
11:13 |
<reedy@tin> |
Synchronized docroot and w: Add jobqueue-labs to noc (duration: 00m 28s) |
[production] |
08:59 |
<bblack> |
offlined db1019 megacli disk 32:11 |
[production] |
06:09 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Sat Dec 5 06:09:07 UTC 2015 (duration 3h 44m 18s) |
[production] |
02:24 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.7) (duration: 09m 59s) |
[production] |
2015-12-04
|
21:44 |
<andrewbogott> |
disabling puppet on labcontrol1002 for ldap testing |
[production] |
21:36 |
<ori@tin> |
Synchronized php-1.27.0-wmf.7/includes/Hooks.php: Iba0138a: Don't install a custom error handler for hooks (T117553) (duration: 00m 28s) |
[production] |
20:28 |
<ori@tin> |
Synchronized wmf-config/jobqueue-eqiad.php: Idee6a1980: job queue: use instances on port 6378 as aggregators (duration: 00m 30s) |
[production] |
19:21 |
<ori> |
krypton: updated Grafana to 2.6.0-beta1 for bug fix for issue 3422 |
[production] |
15:52 |
<Jeff_Green> |
add mx record for donate.wikimedia.org |
[production] |
15:33 |
<godog> |
ms-be2019 rebooted by itself, ilo event log shows "Uncorrectable Machine Check Exception (Board 0, Processor 2, APIC ID 0x00000038, Bank 0x00000003, Status 0xFE000040'00020135, Address 0x00000000'FEB82F63, Misc 0x00000000'00002285)" |
[production] |
08:52 |
<godog> |
reimage restbase1009 |
[production] |
05:59 |
<gwicke> |
ran systemctl mask cassandra on restbase1009; it is important that this node does not start up. |
[production] |
05:53 |
<gwicke> |
moved /var/lib/cassandra out of the way in an attempt to stop puppet restarting cassandra on decommissioned restbase1009 |
[production] |
05:49 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Fri Dec 4 05:49:46 UTC 2015 (duration 3h 21m 36s) |
[production] |
02:28 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.7) (duration: 10m 19s) |
[production] |
02:15 |
<ori> |
CirrusSearch-common.php sync was for I826d000ca: Turn off backoff throttling of CirrusSearch jobs |
[production] |
02:15 |
<ori@tin> |
Synchronized wmf-config/CirrusSearch-common.php: (no message) (duration: 00m 29s) |
[production] |
01:33 |
<bd808> |
Updated scholarships.wikimedia.org to af73bf6 |
[production] |
00:35 |
<catrope@tin> |
Synchronized php-1.27.0-wmf.7/extensions/CentralNotice: SWAT (duration: 00m 32s) |
[production] |
2015-12-03
|
23:09 |
<bblack> |
restarting pybal (w/ BGP enabled) on lvs100[123] (newly-installed w/ jessie) |
[production] |
22:59 |
<ori@tin> |
Synchronized php-1.27.0-wmf.7/includes/jobqueue/JobRunner.php: temporarily disable job throttling (duration: 00m 29s) |
[production] |
22:08 |
<bd808> |
Removed zirconium.wikimedia.org from Trebuchet minions list for scholarships/scholarships |
[production] |
22:04 |
<bd808> |
Updated scholarships.wikimedia.org to cb94319 plus local i18n filtering |
[production] |
21:48 |
<Reedy> |
finished removing bogus msg_resource rows |
[production] |
21:28 |
<oblivian@tin> |
Synchronized wmf-config/CommonSettings.php: re-sync (re-merged the change) (duration: 00m 29s) |
[production] |
21:27 |
<bd808> |
Applied database migrations and purged last year's data from Wikimania Scholarships db |
[production] |
21:21 |
<ottomata> |
restarted eventlogging with 4 mysql consumer processes running in parallel |
[production] |
21:21 |
<bblack> |
rebooting lvs100[123] for reinstall to jessie |
[production] |
21:18 |
<Reedy> |
Cleaning up msg_resource rows with bogus language codes |
[production] |
21:15 |
<gwicke> |
stopped cassandra on 1009 as it's decommissioned & will be reimaged |
[production] |
21:13 |
<oblivian@tin> |
Synchronized wmf-config/CommonSettings.php: Re-fix the jobqueue on wikitech after redis cleanup (duration: 00m 26s) |
[production] |
20:55 |
<oblivian@tin> |
Synchronized wmf-config/CommonSettings.php: Fix the jobqueue on wikitech (duration: 00m 47s) |
[production] |
20:45 |
<_joe_> |
opening connection from mw1001 to silver, mysql |
[production] |
20:29 |
<ori> |
on palladium: salt -G 'cluster:jobrunner' cmd.run 'service jobrunner status | grep running && service jobrunner restart' ; salt -G 'cluster:jobrunner' cmd.run 'service jobchron status | grep running && service jobchron restart' |
[production] |
20:28 |
<ori> |
ran srem jobqueue:aggregator:s-wikis:v2 labswiki on rdb1001 aggr |
[production] |
19:41 |
<bblack> |
disabling pybal on lvs100[123] over the next few minutes (for reinstall to jessie later after confirmation everything is still ok on [456]) |
[production] |
19:10 |
<jynus> |
restarting eventlogging_sync on db1047 and dbstore1002 |
[production] |
19:04 |
<jynus> |
starting m4 slave again on dbstore2002 |
[production] |
18:45 |
<andrewbogott> |
disabling puppet on labcontrol1002 to test openldap with pdns |
[production] |
18:33 |
<mutante> |
neon - remove icinga user from "dialout" group |
[production] |
18:27 |
<jynus> |
disabling eventlogging_sync process on dbstore1002 and db1047 and replication on the other m4 slaves |
[production] |
18:18 |
<jynus> |
disabling event scheduler on db1046 (m4-master) |
[production] |
17:03 |
<kartik@tin> |
Finished scap: Update ContentTranslation (duration: 05m 52s) |
[production] |
16:57 |
<kartik@tin> |
Started scap: Update ContentTranslation |
[production] |
16:50 |
<oblivian@tin> |
Synchronized wmf-config/CommonSettings.php: Fix the jobqueue on wikitech (duration: 00m 28s) |
[production] |