2016-07-04
|
20:38 |
<joal> |
Insert monitoring test data into cassandra on hosts aqs100[456] to prevent icinga alarms |
[analytics] |
20:38 |
<joal> |
Insert monitoring test to make tests pass |
[analytics] |
20:28 |
<jynus> |
removing /tmp/joal/sstables on all analytics10* hosts |
[production] |
20:22 |
<jynus> |
deleted 21GB worth of temporary files from analytics1050 |
[production] |
19:58 |
<aaron@tin> |
Synchronized wmf-config/filebackend-production.php: Increase redis lockmanager timeout to 2 (duration: 00m 31s) |
[production] |
19:57 |
<legoktm@tin> |
Synchronized php-1.28.0-wmf.8/extensions/MassMessage/: MassMessage is no longer accepting lists in the MassMessageList content model - T139303 (duration: 00m 39s) |
[production] |
18:58 |
<hashar> |
Upgrading arcanist on permanent CI slaves since xhpast was broken T137770 |
[releng] |
17:37 |
<jynus> |
testing slave_parallel_threads=5 on db1073 |
[production] |
14:27 |
<moritzm> |
rebooting lithium for kernel update |
[production] |
14:22 |
<moritzm> |
installing tomcat7/libservlet3.0-java security update on the kafka brokers |
[production] |
14:06 |
<_joe_> |
shutting down mw1001-1008 for decommissioning |
[production] |
14:03 |
<gehel> |
rolling restart of elasticsearch codfw/eqiad for kernel upgrade (T138811) |
[production] |
13:47 |
<_joe_> |
stopping jobrunner on mw1011-16 as well, before decommissioning |
[production] |
13:46 |
<moritzm> |
depooling mw1153-mw1160 (trusty image scalers), replaced by mw1291-mw1298 (jessie image scalers) |
[production] |
13:44 |
<godog> |
ack all mr1-codfw related alerts in librenms |
[production] |
13:43 |
<akosiaris> |
restart smokeping on netmon1001, temporarily disabled msw1-codfw |
[production] |
13:38 |
<gehel> |
resuming writes on Cirrus / elasticsearch, this did not speed up cluster recovery |
[production] |
13:30 |
<paladox> |
installing Gearman plugin in jenkins on gerrit-test instance |
[git] |
13:18 |
<godog> |
bounce redis on rcs1001 |
[production] |
13:16 |
<gehel> |
restarting elastic1021 for kernel upgrade (T138811) |
[production] |
13:07 |
<elukey> |
Bootstrapping Cassandra again on aqs100[456] (rack awareness + 2.2.6 - testing environment) |
[production] |
13:02 |
<gehel> |
pausing writes on Cirrus / elasticsearch for faster cluster restart |
[production] |
12:50 |
<yuvipanda> |
migrating deployment-tin to labvirt1011 |
[releng] |
12:43 |
<hashar> |
Nodepool back up with 10 instances (instead of 20) to accommodate labs capacity T139285 |
[production] |
12:39 |
<godog> |
nodetool-b stop -- COMPACTION on restbase1014 |
[production] |
12:37 |
<yuvipanda> |
migrate test-prometheus2 to labvirt1011 |
[monitoring] |
12:33 |
<yuvipanda> |
reduced instances quota to 10 before starting it back up for T139285 |
[contintcloud] |
12:29 |
<moritzm> |
rolling reboot of rcs* cluster for kernel security update |
[production] |
12:10 |
<moritzm> |
rolling reboot of ocg* cluster for kernel security update |
[production] |
11:46 |
<paladox> |
finished migration (disabled from loading in apache2 for now); will need to be added in sites-e*. Now deleting phab-03 instance. |
[phabricator] |
11:41 |
<paladox> |
sorry, I am migrating it to phab-05 |
[phabricator] |
11:40 |
<jynus@tin> |
Synchronized wmf-config/db-eqiad.php: Failover db1053 to db1072 (duration: 00m 40s) |
[production] |
11:39 |
<yuvipanda> |
delete project, is no longer used |
[zulip] |
11:39 |
<paladox> |
migrating 50-phabricator.conf from phab-03 to phab-02. |
[phabricator] |
11:39 |
<tom29739> |
deleted 4 instances that are not being used right now. |
[privpol-captcha] |
11:38 |
<paladox> |
deleting phab-03 instance. To test git redirects please install them on phab-02 instance or git-redirect-01 instance. Reason: labs is out of space, and we can use the rules on the same instance without needing a separate one. |
[phabricator] |
11:37 |
<yuvipanda> |
delete zulip-01, unused |
[zulip] |
11:34 |
<paladox> |
deleting git-phab4 instance; it is not needed and this will free space in labs. |
[git] |
11:28 |
<yuvipanda> |
deleted project (after verifying with jynus) |
[ops-db-candidates] |
11:28 |
<yuvipanda> |
deleted project |
[marathon-eval] |
11:18 |
<yuvipanda> |
deleted project |
[marathon] |
11:15 |
<yuvipanda> |
delete instance waah, was totally unused for a long time |
[marathon] |
11:14 |
<yuvipanda> |
stop instance papaultest to free up some resources on labvirt1006 |
[testlabs] |
11:14 |
<yuvipanda> |
stop instance tool-master-02 to free up some resources on labvirt1006 |
[testlabs] |
11:13 |
<yuvipanda> |
delete tools-prometheus-01 to free up resources on labvirt1010 |
[tools] |
11:11 |
<yuvipanda> |
actually deleted instance tools-cron-02 to free up resources on labvirt1010 - was large and not currently used, and failover process takes a while anyway, so we can recreate if needed |
[tools] |